LLM Web Interaction

Please write a script in Python that will display the interaction with this LLM in a simple browser.

```python
from transformers import LlamaTokenizer, LlamaForCausalLM
import torch

tokenizer = LlamaTokenizer.from_pretrained("Xianjun/PLLaMa-7b-base")
model = LlamaForCausalLM.from_pretrained("Xianjun/PLLaMa-7b-base").half().to("cuda")

instruction = "How to …"
batch = tokenizer(instruction, return_tensors="pt", add_special_tokens=False).to("cuda")
with torch.no_grad():
    output = model.generate(**batch, max_new_tokens=512, temperature=0.7, do_sample=True)
response = tokenizer.decode(output[0], skip_special_tokens=True)
```

To display the interaction with the LLM in a simple browser interface, you can use `Flask` for the backend and `HTML` for the front-end. Below is a simple script that integrates your code into a web application:

1. Install Flask if you haven’t already:

```bash
pip install flask
```

2. Create the following Python script for the backend (`app.py`):

```python
from flask import Flask, render_template, request
from transformers import LlamaTokenizer, LlamaForCausalLM
import torch

app = Flask(__name__)

# Load the model and tokenizer once at startup
tokenizer = LlamaTokenizer.from_pretrained("Xianjun/PLLaMa-7b-base")
model = LlamaForCausalLM.from_pretrained("Xianjun/PLLaMa-7b-base").half().to("cuda")

@app.route("/", methods=["GET", "POST"])
def index():
    response = ""
    if request.method == "POST":
        instruction = request.form.get("instruction")
        if instruction:
            # Tokenize the input and generate a response
            batch = tokenizer(instruction, return_tensors="pt", add_special_tokens=False).to("cuda")
            with torch.no_grad():
                output = model.generate(**batch, max_new_tokens=512, temperature=0.7, do_sample=True)
            response = tokenizer.decode(output[0], skip_special_tokens=True)
    return render_template("index.html", response=response)

if __name__ == "__main__":
    app.run(debug=True)
```

3. Create an `index.html` file inside a `templates` folder in the same directory as `app.py`. This file will be used to render the user interface in the browser:

```html
<!DOCTYPE html>
<html>
<head>
    <title>LLM Interaction</title>
</head>
<body>
    <h1>Interact with the LLM</h1>
    <form method="POST">
        <input type="text" name="instruction" placeholder="Enter your instruction">
        <button type="submit">Submit</button>
    </form>
    <h2>Response:</h2>
    <p>{{ response }}</p>
</body>
</html>
```
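Because the template ends in `.html`, Flask's Jinja2 engine auto-escapes `{{ response }}`, so any HTML-like text in the model's output is rendered literally rather than interpreted as markup. A stdlib sketch of the same escaping (for illustration; this helper is not part of the app above):

```python
import html

def render_response(response: str) -> str:
    # Approximates what Jinja2's autoescaping does to {{ response }}:
    # <, >, & and quotes become entities, so model output that happens
    # to contain tags is displayed as text instead of becoming markup.
    return "<p>" + html.escape(response) + "</p>"

print(render_response("Use <b>bold</b> & <script> with care"))
```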

### How It Works:
- The backend script (`app.py`) uses Flask to serve a simple web application.
- The frontend (`index.html`) provides an input field for users to submit text.
- When the user submits the text, the backend processes the input with the `LlamaForCausalLM` model, generates a response, and displays it in the browser.
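The branching in the route can be distilled into a plain function, with the model call passed in as a parameter (a sketch for illustration; `handle_index` and the stub generator are not part of the app above):

```python
def handle_index(method, form, generate):
    """Mirrors the route's control flow: only a POST with a non-empty
    'instruction' field reaches the model; anything else yields ""."""
    if method == "POST":
        instruction = form.get("instruction")
        if instruction:
            return generate(instruction)
    return ""

# Stub standing in for tokenize -> model.generate -> decode
echo = lambda text: f"echo: {text}"

print(handle_index("GET", {}, echo))                      # first page load: empty response
print(handle_index("POST", {"instruction": "hi"}, echo))  # form submission: generated text
```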

### Running the Application:
1. Ensure you have both the `app.py` and `index.html` files correctly set up.
2. Run the script with the following command:

```bash
python app.py
```

3. Visit `http://127.0.0.1:5000/` in your browser to interact with the model.

The browser will display the input field and show the response after you submit an instruction.
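You can also exercise the endpoint without a browser. A stdlib sketch that submits the same form POST the page would send (assumes `app.py` is running on Flask's default `http://127.0.0.1:5000/`; `build_form_request` and `ask` are illustrative names, not part of the app):

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def build_form_request(instruction, url="http://127.0.0.1:5000/"):
    # Encode the field exactly as the browser form does
    body = urlencode({"instruction": instruction}).encode("utf-8")
    return Request(url, data=body, method="POST")

def ask(instruction):
    # Sends the request; requires the Flask app to be running locally
    with urlopen(build_form_request(instruction)) as resp:
        return resp.read().decode("utf-8")
```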