FAQs
I've never heard of NetHack. What is it? NetHack is an open-source roguelike video game first released in 1987. The game is procedurally generated, features ASCII graphics, and is notoriously difficult to play. The source code is written in C.
In 2020, Küttler et al. released the NetHack Learning Environment (NLE), wrapping the video game into a reinforcement learning (RL) environment for ML/AI research. This environment was presented at NeurIPS 2020.
A year later at NeurIPS 2021, Hambro et al. organized a competition centered on NetHack, dubbed the NetHack Challenge (NHC). This competition spurred AI researchers, machine learning enthusiasts, and members of the community at large to benchmark and develop neural and symbolic methods in NLE. Symbolic methods beat out neural ones in average in-game score by a staggering margin, leaving purely data-driven neural policies in the dust (Figure 1).
You can read more about NetHack via the game homepage or by checking out the community-maintained NetHack Wiki.
Figure 1: Selected results from the NeurIPS 2021 NetHack Challenge (NHC) compared against results from "NetHack is Hard to Hack" (NeurIPS 2023), showing game score on a log scale (Piterbarg et al., 2023).
Why NetHack?
1. NetHack remains unsolved.
2. Due to its procedurally generated nature, NetHack truly probes policy generalization. Each random seed of the game features a completely unique layout of dungeons, monster encounters, and items to gather.
3. NetHack is compiled in C, yielding blazingly fast simulation. Training neural policies in NLE with RL remains tractable despite the high complexity of the game (see the minimal rollout sketch after this list).
4. HiHack. Unlike other open-ended RL environments like Habitat, Minecraft, or AI2-THOR, the state-of-the-art (SOTA) artificial agent in NetHack is an open-source, hard-coded, symbolic, and hierarchical bot, which we "hack" to generate hierarchically labeled demonstration data. HiHack provides the community with a unique opportunity to explore the impact of ground-truth hierarchical behavioral priors on learning, in a data-unlimited setting.
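To make the third point concrete, here is a minimal rollout sketch following standard NLE/Gym usage. It is illustrative rather than code from the paper; the task name and the exact return signature of step depend on the installed NLE and Gym versions.

import gym  # NLE registers its environments with Gym when imported
import nle  # noqa: F401  (assumes NLE is installed, e.g. via pip install nle)

# "NetHackScore-v0" is one of the standard NLE tasks; available names vary by NLE version.
env = gym.make("NetHackScore-v0")
obs = env.reset()

done = False
total_reward = 0.0
while not done:
    # Random keypresses purely to illustrate the loop; a real agent chooses actions here.
    obs, reward, done, info = env.step(env.action_space.sample())
    total_reward += reward

print("episode return:", total_reward)

Because the simulator is compiled C, stepping the environment is cheap; this is the "blazingly fast simulation" referred to in point 3.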
How large is HiHack exactly? In games? Keypresses? Gigabytes? HiHack contains 109,907 games and 3,244,729,367 keypresses (Table 1). After extraction to ttyrec4.bz2 files, the full dataset is 99 GB.
Despite its scale, neural policies trained on HiHack with imitation learning fail to match the bot in NLE score, even when RL finetuning is thrown into the mix. In the purely offline regime, neural policy architectures based on LSTMs and even transformers (Figure 2) exhibit sub log-linear scaling in NLE score with demonstration count (Piterbarg et al., 2023).
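To make "sub log-linear scaling" concrete: under log-linear scaling, every 10x increase in demonstrations adds a roughly constant number of score points; sub log-linear scaling means even that slows down, with each additional 10x of data adding less than the previous one. A tiny numpy sketch, with made-up numbers used purely for illustration (not results from the paper):

import numpy as np

# Hypothetical demonstration counts and mean NLE scores, for illustration only.
n_demos = np.array([1_000, 10_000, 100_000])
mean_score = np.array([350.0, 600.0, 750.0])

# Fit score ~ a * log10(N) + b.
a, b = np.polyfit(np.log10(n_demos), mean_score, deg=1)

# Under log-linear scaling, each 10x increase in data would add ~a points.
# Here the observed gains shrink (250, then 150), i.e. growth is sub log-linear.
print("fitted gain per 10x data:", a)
print("observed gains per 10x data:", np.diff(mean_score))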
What data format is HiHack saved in? HiHack is saved in the ttyrec data format native to NetHack.
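For intuition about the on-disk layout, here is a minimal reader for the classic ttyrec framing: each frame is a 12-byte little-endian header of seconds, microseconds, and payload length, followed by the raw terminal bytes. NLE's versioned ttyrec files extend this per-frame header, so this sketch is illustrative and is not a drop-in HiHack loader; use NLE's own dataset readers for the real files.

import bz2
import struct

def read_classic_ttyrec(path):
    """Yield (seconds, microseconds, payload) frames from a classic ttyrec file."""
    opener = bz2.open if path.endswith(".bz2") else open
    with opener(path, "rb") as f:
        while True:
            header = f.read(12)
            if len(header) < 12:
                return
            sec, usec, length = struct.unpack("<III", header)
            yield sec, usec, f.read(length)

# Example usage with a hypothetical file path:
# for sec, usec, payload in read_classic_ttyrec("game.ttyrec.bz2"):
#     print(sec, usec, len(payload))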
Table 1: Comparing HiHack to NLD-AA (Hambro et al., 2022).
How does HiHack compare to other offline datasets for NetHack? See Table 1 for a comparison of HiHack to the AutoAscend NetHack Learning Dataset (NLD-AA), the latter containing demonstrations with keypress labels only.
How was HiHack generated? HiHack was generated by adding explicit "strategy" tracking to the AutoAscend source code and to the ttyrec read/write code of NLE. We open-source all code employed for data generation.
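Conceptually, the instrumentation amounts to tagging every keypress the bot emits with the high-level strategy that produced it, and writing that tag out alongside the keypress stream. The sketch below is hypothetical: the class, method, and strategy names are invented for illustration and are not the actual AutoAscend/NLE code.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class StrategyLogger:
    """Hypothetical recorder pairing each emitted keypress with the active strategy."""
    active_strategy: str = "none"
    records: List[Tuple[str, int]] = field(default_factory=list)

    def enter_strategy(self, name: str) -> None:
        # Called whenever the bot's high-level controller switches strategies.
        self.active_strategy = name

    def emit_keypress(self, keycode: int) -> None:
        # Every low-level action is stored with its hierarchical label.
        self.records.append((self.active_strategy, keycode))

# Example usage with made-up strategy names and keycodes:
log = StrategyLogger()
log.enter_strategy("explore")
log.emit_keypress(ord("j"))
log.enter_strategy("fight")
log.emit_keypress(ord("F"))
print(log.records)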
Figure 2: Policy architectures from "NetHack is Hard to Hack" (Piterbarg et al., 2023). Left: Hierarchical LSTM policy, where g_t is the high-level strategy prediction (purple) used to select among the k low-level policies (yellow). "Straight-through" gradients via Gumbel-Softmax are employed to train the bilevel set of decoders end-to-end. Right: Transformer-LSTM policy, where the LSTM encoder (grey) provides a long temporal context h_t to the transformer.
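For readers who prefer code to diagrams, here is a minimal, self-contained PyTorch sketch of the strategy-selection idea in the hierarchical decoder. The hidden size, the number of strategies k, and the action-space size are illustrative placeholders, not the paper's exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalHead(nn.Module):
    """Illustrative bilevel decoder: pick a strategy, then an action from its policy."""

    def __init__(self, hidden_dim: int = 512, num_strategies: int = 8, num_actions: int = 121):
        super().__init__()
        self.strategy_logits = nn.Linear(hidden_dim, num_strategies)
        # One low-level action decoder per strategy.
        self.low_level = nn.ModuleList(
            nn.Linear(hidden_dim, num_actions) for _ in range(num_strategies)
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # g_t: hard one-hot strategy selection via straight-through Gumbel-Softmax,
        # keeping the discrete choice differentiable end-to-end.
        g = F.gumbel_softmax(self.strategy_logits(h), tau=1.0, hard=True)        # (B, k)
        action_logits = torch.stack([dec(h) for dec in self.low_level], dim=1)   # (B, k, A)
        return (g.unsqueeze(-1) * action_logits).sum(dim=1)                      # (B, A)

# Example: h would come from the LSTM (or Transformer-LSTM) core over NLE observations.
head = HierarchicalHead()
h = torch.randn(4, 512)   # a batch of 4 hypothetical recurrent states
print(head(h).shape)      # torch.Size([4, 121])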
How well do state-of-the-art algorithms and models perform in NetHack when pre-trained on HiHack? As of October 30, 2023, the SOTA for data-driven neural policies in NLE is set by agents trained on HiHack, both in the purely offline and offline + online settings (Piterbarg et al., 2023). See Table 2 below and our paper for more details.
Table 2: Aggregate results from "NetHack is Hard to Hack" (Piterbarg et al., 2023), evaluating the impact of hierarchical labels and architectural improvements on the performance of policies trained with behavioral cloning (BC) alone, as well as with combined BC and asynchronous proximal policy optimization (APPO). Behavioral cloning losses were computed over batches from the HiHack Dataset. All policies were trained for 48 hours on a single GPU. Metrics annotated with (†) were computed only for the top-scoring neural policy seed (out of 6) across each model class.
I have more questions. Who should I contact? Please send any inquiries to up2021 -at- cims.nyu.edu.
@misc{piterbarg2023nethack,
  title={NetHack is Hard to Hack},
  author={Ulyana Piterbarg and Lerrel Pinto and Rob Fergus},
  year={2023},
  eprint={2305.19240},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}