LLMFarm_core is a Swift library for working with large language models (LLMs). It allows you to load different LLMs with certain parameters.
It is based on ggml and llama.cpp by Georgi Gerganov.
It also uses sources from:
- rwkv.cpp by saharNooby.
- Mia by byroneverson.
- macOS (13+)
- iOS (16+)
- Various inference architectures (LLaMA, RWKV, and others)
- Metal for LLaMA inference (macOS and iOS)
- Model setting templates
- llama.cpp sampling methods for other inferences
- Classifier-free guidance sampling from llama.cpp
- Support for other tokenizers
- Context state restore (currently chat history only)
- Metal for other inferences
```bash
git clone https://github.com/guinmoon/llmfarm_core.swift
```
Add `llmfarm_core` to your project using Xcode (File > Add Packages...) or by adding it to your project's `Package.swift` file:
```swift
dependencies: [
    // NOTE: the branch argument is an assumption; SPM requires a version,
    // branch, or revision requirement, so pin whichever suits your project.
    .package(url: "https://github.com/guinmoon/llmfarm_core.swift", branch: "main")
]
```
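When you declare the dependency in `Package.swift`, the library also has to appear in the dependencies of the target that imports it. A minimal sketch follows; the target name `MyApp` is a placeholder, and the product/package names are assumptions based on the repository URL and the `import llmfarm_core` statement used below:

```swift
targets: [
    .target(
        name: "MyApp", // placeholder target name
        dependencies: [
            // Product/package names assumed from the import and repository URL.
            .product(name: "llmfarm_core", package: "llmfarm_core.swift")
        ]
    )
]
```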
To debug the `llmfarm_core` package, do not forget to comment out `.unsafeFlags(["-Ofast"])` in `Package.swift`. Keep in mind that the debug build is slower than the release build.

To build with `QKK_64` support, uncomment `.unsafeFlags(["-DGGML_QKK_64"])` in `Package.swift`.
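For illustration, the relevant settings in `Package.swift` might look like the sketch below; the exact targets and settings keys in the real manifest may differ:

```swift
// Sketch only: the real Package.swift may attach these flags to a
// different target or settings key (e.g. cSettings vs. swiftSettings).
cSettings: [
    .unsafeFlags(["-Ofast"]),           // comment this out while debugging
    // .unsafeFlags(["-DGGML_QKK_64"])  // uncomment for QKK_64 support
]
```

The example below loads a quantized LLaMA model with Metal enabled and streams the generated text through a callback that stops generation once `maxOutputLength` characters have been produced: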
```swift
import Foundation
import llmfarm_core

let maxOutputLength = 256
var total_output = 0

// Called for each generated piece of text; return true to stop generation.
func mainCallback(_ str: String, _ time: Double) -> Bool {
    print("\(str)", terminator: "")
    total_output += str.count
    if total_output > maxOutputLength {
        return true
    }
    return false
}

let input_text = "State the meaning of life."

// Load a quantized LLaMA model with Metal acceleration enabled.
let ai = AI(_modelPath: "llama-2-7b.ggmlv3.q4_K_M.bin", _chatName: "chat")
var params: ModelContextParams = .default
params.use_metal = true

try? ai.loadModel(ModelInference.LLamaInference, contextParams: params)
ai.model.promptFormat = .LLaMa_bin

// Generate text; mainCallback streams output and enforces the length limit.
let output = try? ai.model.predict(input_text, mainCallback)
```
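The features list above mentions llama.cpp-style sampling. A hedged sketch of tuning it, assuming the package exposes a `ModelSampleParams` struct and a `sampleParams` property on the model mirroring llama.cpp's sampling options (the names here are assumptions; check the package sources for the exact API):

```swift
// Assumed API: ModelSampleParams and ai.model.sampleParams are not shown in
// the snippet above; verify the names against the llmfarm_core sources.
var sampleParams: ModelSampleParams = .default
sampleParams.temp = 0.8           // softmax temperature
sampleParams.top_k = 40           // keep only the 40 most likely tokens
sampleParams.top_p = 0.95         // nucleus sampling cutoff
sampleParams.repeat_penalty = 1.1 // discourage verbatim repetition
ai.model.sampleParams = sampleParams
```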