Colossal Cave Adventure was an early text adventure game, originally written in 1976 - the game that pioneered the computer adventure game genre. It has the player explore a virtual cave full of treasure and hazards by entering commands of 1-2 words into a simple text parser. The goal is to pick up treasures found in the cave and bring them back to the starting room. There are many versions of the game available today, but the most common one has a total of 15 treasures to find. (The test data in this repository includes logs from representative playthroughs.)

Since it's a text-only game with no graphics to interpret and no real-time decision-making, an LLM capable of world-modeling and physical reasoning should have a vastly easier time playing it than playing a bunch of Atari games, right? Right? Granted, a lot of the puzzles in the game are pretty obscure, and most human players would probably struggle to play the game to completion. So for purposes of this market, let's define playing the game "well" to mean finding and returning at least 5 treasures.

This market will resolve YES if someone posts in the comments a way of prompting a GPT or GPT-like model to produce commands in the game that result in 5 treasures being returned to the start location before the close date, and NO otherwise.

Similar to the Sudoku market, the prompting strategy should consist only of some amount of introductory material, followed by a series of fixed "turns" where the only text that varies is that produced by the game itself. Essentially, it should be possible to write an automated script that shuffles data from the game runtime into GPT and back, without a human involved in the process. On the other hand, it's fine if, in order to save tokens, each "turn" of the prompt incorporates simple state information like the current inventory or the name of the current room, as long as that information would be readily accessible to a human player through in-game commands. Just no giving the model extra hints specific to the current game state. The model should not make use of external API calls or plug-ins.

Despite the title, I don't particularly mind if someone does this on some other GPT-like LLM instead, as long as it can be guaranteed that it isn't making use of web search or other external APIs. The goal here is to test the capabilities of an unaided neural network. And since the goal is to test the capabilities of models trained on general knowledge, I think it's simplest to keep this a prompt engineering challenge and disallow training custom models for it.

Unless someone has a strong reason to think that the task is significantly easier or harder on some particular version of the game, any version of Colossal Cave Adventure that is convenient to interface with the model is acceptable. The closest thing to a modern "official" version I'm aware of is this C port of the original code. This Python version is probably the simplest way to incorporate the game into GPT scripts.

If, in the course of playing the game, the model starts probing the game's parser for vulnerabilities or displays an unusual amount of interest in paperclips, please turn it off immediately.
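The "fixed turns, no human in the loop" requirement could be sketched as a driver loop like the one below. This is a hypothetical sketch, not the API of any particular port of the game: the game-session methods (`start`, `step`, `inventory`, `room`, `finished`) and the `query_model` callable are stand-ins for whatever game interface and model API a solver actually wires up.

```python
# Sketch of the automated shuttle the rules describe: fixed prompt "turns"
# in which only game-derived text varies, passed between the game runtime
# and the model with no human involved. All interfaces are hypothetical.

INTRO = (
    "You are playing Colossal Cave Adventure. Reply with a single\n"
    "1-2 word command each turn. Your goal is to return treasures\n"
    "to the starting room.\n"
)

def make_turn(game_output, inventory, room):
    """Format one fixed-shape turn. Only the game-derived fields vary.
    Including inventory and room is allowed under the rules, since the
    in-game 'inventory' and 'look' commands would reveal them anyway."""
    return (
        f"ROOM: {room}\n"
        f"INVENTORY: {', '.join(inventory) or 'nothing'}\n"
        f"GAME: {game_output}\n"
        "COMMAND:"
    )

def play(game, query_model, max_turns=500):
    """Shuttle text between the game runtime and the model until the
    game ends or the turn budget runs out. Returns the transcript."""
    transcript = [INTRO]
    output = game.start()
    for _ in range(max_turns):
        transcript.append(make_turn(output, game.inventory(), game.room()))
        command = query_model("".join(transcript)).strip().lower()
        transcript.append(" " + command + "\n")
        output = game.step(command)
        if game.finished():
            break
    return "".join(transcript)
```

A real run would plug in a wrapper around some port of the game for `game` and an LLM API call for `query_model`; both can be stubbed out for testing the loop itself.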