I've posted my question at Facebook Research, but none of the solutions suggested by the community worked:
https://github.com/facebookresearch/llama/issues/936
I've also posted my question at Nvidia, but got no answers.
I'm running Ubuntu 22.04 (Jammy) on WSL2 on Windows 11. This is my run.py code:
import torch
import transformers
import requests

print(torch.cuda.is_available())
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model and adapter weights from local directory
model = transformers.AutoModelForCausalLM.from_pretrained("/home/maxloo/src/pastoring/llama/llama-2-13b")
model.to(device)
adapter = transformers.AutoModelForCausalLM.from_pretrained("/home/maxloo/src/pastoring/adapter", config=transformers.configuration.AdapterConfig.from_json_file("adapter_config.json"))
model.load_state_dict(adapter.state_dict())
adapter.load_state_dict(model.state_dict())

# Define prompt
prompt = "Hello, I am a chatbot."

# Perform inference
response = model.generate(prompt, max_length=50)

# Print response
print(response)
When I run it from Bash with python3 run.py, I expect a chat message to be displayed followed by a prompt for my chat input, but this is the actual output:
Killed
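In case it helps narrow things down, I am assuming the Killed output means the process is being terminated for using too much memory while loading the 13B checkpoint, although I have not confirmed that. Below is a lower-memory loading variant I could try instead; it is only a sketch, and it assumes float16 precision is acceptable, that the accelerate package is installed (needed for device_map="auto"), and that the model directory also contains the tokenizer files:

import torch
import transformers

# Sketch of a lower-memory load: half precision plus automatic device placement.
# Assumes the accelerate package is installed and the paths are the same as above.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "/home/maxloo/src/pastoring/llama/llama-2-13b",
    torch_dtype=torch.float16,   # load weights in half precision instead of float32
    device_map="auto",           # place weights on GPU/CPU automatically while loading
    low_cpu_mem_usage=True,      # avoid building a second full copy in system RAM
)

# Assumes the same directory holds the tokenizer files (tokenizer.model, etc.).
tokenizer = transformers.AutoTokenizer.from_pretrained("/home/maxloo/src/pastoring/llama/llama-2-13b")

prompt = "Hello, I am a chatbot."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_length=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

I have not verified whether this variant avoids the problem.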
Could someone please help with this error?