LLMs with 100k Tokens

What LLMs can reason with a 100k token count?

Several large language models (LLMs) are capable of processing context windows of up to 100,000 tokens, enabling them to handle extensive documents and complex tasks. Notable examples include:

– **Claude 2 by Anthropic**: Released in July 2023, Claude 2 allows users to input up to 100,000 tokens per prompt, facilitating the analysis of lengthy texts such as technical documentation or entire books.

– **GPT-4**: OpenAI’s GPT-4 has demonstrated the ability to manage context windows exceeding 100,000 tokens, enhancing its performance in tasks requiring the processing of large volumes of text. citeturn0search2

– **Kimi-Chat**: Developed by AI, Kimi-Chat is another proprietary model capable of handling over 100,000 tokens, contributing to advancements in processing extensive textual data. citeturn0search2

In the open-source domain, context-extension methods such as YaRN, and long-context models such as Yi-34B-200K, have pushed context windows to 100,000 tokens and beyond, expanding the accessibility of high-capacity LLMs.

These advancements in LLMs with extended context windows enhance their ability to perform complex reasoning and process substantial amounts of information efficiently.

What would I need to do to increase the token length of a DeepSeek-R1 model?

The DeepSeek-R1 model is designed with a substantial context window of up to 128,000 tokens, allowing it to process extensive inputs. However, the maximum length for generated outputs is typically set to 32,768 tokens.

To adjust the token limits for your specific use case, consider the following steps:

1. **API Configuration**: If you’re utilizing the DeepSeek API, you can modify the `max_tokens` parameter in your API requests to control the length of the generated output. Ensure that the value set does not exceed the model’s maximum generation limit.

2. **Local Deployment**: For those running the model locally, particularly through platforms like Ollama, it’s possible to adjust the context length settings within the configuration files. This involves modifying specific parameters to increase the context window, thereby allowing the model to handle longer inputs. Detailed guidance on this process is available in community resources.

3. **Model Quantization**: If hardware limitations are a concern, employing quantized versions of the model can reduce memory requirements, facilitating the handling of larger context windows. For instance, dynamically quantized models can significantly decrease the model size, making them more accessible for local deployment.
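For step 2, Ollama exposes the context window through its `num_ctx` parameter. A minimal Modelfile sketch — the model tag (`deepseek-r1:14b`) and the chosen value are assumptions; pick a context length your hardware can actually hold:

```
# Derive a local variant of DeepSeek-R1 with a larger context window.
FROM deepseek-r1:14b
PARAMETER num_ctx 32768
```

The variant can then be built and launched with `ollama create r1-longctx -f Modelfile` followed by `ollama run r1-longctx`; alternatively, `/set parameter num_ctx 32768` changes it for a single interactive session.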
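To illustrate step 1: the DeepSeek API is OpenAI-compatible, so a request can be sketched with the standard chat-completions payload. This is a minimal sketch, not official sample code — the `deepseek-reasoner` model name and the 32,768-token generation ceiling should be verified against the current API documentation:

```python
def build_request(prompt: str, max_tokens: int = 8192) -> dict:
    """Build an OpenAI-compatible chat-completions payload for DeepSeek-R1.

    max_tokens caps the *generated* output only; the 128k context window
    governs how much input the model can attend to.
    """
    MAX_GENERATION = 32_768  # documented output ceiling for DeepSeek-R1
    return {
        "model": "deepseek-reasoner",
        "messages": [{"role": "user", "content": prompt}],
        # Clamp so the API does not reject an over-limit request.
        "max_tokens": min(max_tokens, MAX_GENERATION),
    }

# With the OpenAI Python client, this payload would be sent roughly as:
#   from openai import OpenAI
#   client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"],
#                   base_url="https://api.deepseek.com")
#   client.chat.completions.create(**build_request("Summarize this report."))
```

Clamping client-side keeps the request valid even if a caller asks for more output than the model can generate.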

By implementing these adjustments, you can effectively increase the token length capacity of the DeepSeek-R1 model to better suit your application’s needs.
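As a back-of-the-envelope illustration of why context length, memory, and quantization (step 3) interact: the attention KV cache grows linearly with the context window. The sketch below assumes generic grouped-query-attention dimensions (48 layers, 8 KV heads, head dimension 128) chosen for illustration — the full DeepSeek-R1 uses latent attention and its distilled variants have their own shapes, so treat the numbers as rough:

```python
def kv_cache_bytes(ctx_len: int, n_layers: int = 48, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_elem: float = 2) -> float:
    """Approximate KV-cache size for one sequence.

    Two tensors (K and V) per layer, each of shape
    (ctx_len, n_kv_heads, head_dim), stored at bytes_per_elem precision:
    2 for fp16, 1 for int8, ~0.5 for 4-bit schemes.
    """
    return 2 * n_layers * ctx_len * n_kv_heads * head_dim * bytes_per_elem

gib = 1024 ** 3
print(round(kv_cache_bytes(32_768) / gib, 2))   # 6.0  GiB at fp16
print(round(kv_cache_bytes(131_072) / gib, 2))  # 24.0 GiB at fp16 (4x context, 4x cache)
```

Quadrupling the context quadruples the cache, which is why quantizing weights (and, where supported, the cache itself) is often what makes a 100k+ window fit on local hardware.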