Comparing Bip-Rep:main...dsd:main · Bip-Rep/sherpa · GitHub

Comparing changes

base repository: Bip-Rep/sherpa (base: main)
head repository: dsd/sherpa (compare: main)
  • 6 commits
  • 19 files changed
  • 1 contributor

Commits on Jun 30, 2023

  1. Adjust default prompts

    Set the main default prompt to chat-with-bob from llama.cpp.
    This seems to produce much more useful conversations with the llama-7b
    and orca-mini-3b models I have tested.
    
    Also make the reverse prompt consistently "User:" in both default prompt
    options, and set the default reverse prompt detection to the same value.
    dsd committed Jun 30, 2023 · 772e6c2
  2. Android: disable ARM32 build

    llama.cpp doesn't build for ARM32 because it calls into 64-bit NEON
    intrinsics. That's not worth fixing; let's just not offer this app on
    ARM32.
    dsd committed Jun 30, 2023 · 0af43ba
  3. Build llama.cpp within flutter build process

    Rather than using prebuilt libraries, build the llama.cpp git submodule
    during the regular app build process.
    
    The library will now be installed in a standard location, which simplifies
    the logic needed to load it at runtime; there is no need to ship it as an
    asset.
    
    This works on Android, and also enables the app to build and run on Linux.
    Windows build is untested.
    
    One unfortunate side effect is that when building the app in Flutter's
    debug mode, the llama library is built unoptimized and runs very
    slowly, to the point where you might suspect the app is broken.
    Release mode, however, seems as fast as before.
    dsd committed Jun 30, 2023 · 1e64ad7
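
The build-integration approach described in the commit above could be wired up roughly as follows, assuming a CMake-based native build. This is a minimal sketch; the directory and target names here are illustrative, not taken from the actual repository:

```cmake
# Build the llama.cpp git submodule as part of the regular native build,
# instead of shipping prebuilt libraries as assets.
add_subdirectory(llama.cpp)          # path to the submodule (illustrative)

# Link the app's native target against the library produced above, so it
# is installed in a standard location and found at runtime without extra
# loading logic.
target_link_libraries(app_native PRIVATE llama)
```

Building the submodule this way inherits Flutter's build type, which explains the debug-mode slowness noted above: a debug build of llama.cpp is unoptimized.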

Commits on Jul 1, 2023

  1. Update llama.cpp and move core processing to native code

    Update llama.cpp to the latest version as part of an effort to make this
    app usable on my Samsung Galaxy S10 smartphone.
    
    The newer llama.cpp includes a double-close fix which was causing the app
    to immediately crash upon starting the AI conversation (llama.cpp commit
    47f61aaa5f76d04).
    
    It also adds support for 3B models, which are considerably smaller. The
    llama-7B models were causing Android's low memory killer to terminate
    Sherpa after just a few words of conversation, whereas new models such as
    orca-mini-3b.ggmlv3.q4_0.bin work on this device without quickly exhausting
    all available memory.
    
    llama.cpp's model compatibility has changed within this update, so ggml
    files that were working in the previous version are unlikely to work now;
    they need converting. However, the orca-mini offering is already in the
    new format and works out of the box.
    
    llama.cpp's API has changed in this update. Rather than rework the Dart
    code, I opted to leave it in C++, using llama.cpp's example code as a base.
    This solution is included in a new "llamasherpa" library which calls
    into llama.cpp. Since lots of data is passed around in large arrays,
    I expect running this in Dart incurred quite some overhead, and this
    native approach should perform considerably faster.
    
    This eliminates the need for Sherpa's Dart code to call llama.cpp directly,
    so there's no need to separately maintain a modified version of llama.cpp
    and we can use the official upstream.
    dsd committed Jul 1, 2023 · e483499
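
The core-processing move described above could look something like the sketch below, assuming the Dart side calls a single C entry point via dart:ffi and receives generated text through a callback. The names `llamasherpa_run` and `token_cb` are hypothetical, not the library's actual API, and the body is a stub rather than llama.cpp's real sampling loop:

```cpp
#include <string>

extern "C" {

// Callback through which generated tokens stream back to the caller, so
// large arrays never have to be marshalled through Dart-side code.
typedef void (*token_cb)(const char* token, void* user_data);

// In the real library this would drive llama.cpp's sampling loop; here a
// stub emits a fixed reply so the sketch stays self-contained.
int llamasherpa_run(const char* prompt, token_cb on_token, void* user_data) {
    if (prompt == nullptr || on_token == nullptr) return -1;
    const char* reply[] = {"Hello", "!", " How", " can", " I", " help", "?"};
    for (const char* t : reply) {
        on_token(t, user_data);  // stream each token to the caller
    }
    return 0;
}

}  // extern "C"
```

Keeping the loop in C++ means only the prompt crosses the FFI boundary going in, and tokens stream back one callback at a time.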

Commits on Jul 3, 2023

  1. Fix initialization of default prompt

    On first run on my Android device, the pre-prompt is empty; it never
    gets initialized to any value.
    
    This is because SharedPreferences performs asynchronous disk I/O,
    and initDefaultPrompts() uses a different SharedPreferences instance from
    getPrePrompts(). There's no guarantee that a preferences update on one
    instance will become immediately available in another.
    
    Tweak the logic to not depend on synchronization between two
    SharedPreferences instances.
    dsd committed Jul 3, 2023 · ec3dd48
  2. Pass initial message separately from prompt

    The llama.cpp logic is built around the prompt ending with the
    reverse-prompt and the actual user input being passed separately.
    
    Adjust Sherpa to do the same, rather than appending the first line of
    user input to the prompt.
    dsd committed Jul 3, 2023 · 06c75c3
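
The prompt/input split described in the last commit, together with the "User:" reverse-prompt convention from the first one, can be sketched as a pair of small helpers. These are illustrative, assuming the reverse prompt is "User:"; the function names are not from the actual Sherpa code:

```cpp
#include <string>

// The stored prompt now ends with the reverse prompt "User:"; the first
// user message is supplied separately at generation time instead of being
// appended to the prompt up front.
std::string build_input(const std::string& prompt, const std::string& message) {
    return prompt + " " + message + "\n";
}

// Generation stops once the model starts speaking as the user again,
// i.e. when the output ends with the reverse prompt.
bool hit_reverse_prompt(const std::string& output,
                        const std::string& reverse = "User:") {
    return output.size() >= reverse.size() &&
           output.compare(output.size() - reverse.size(),
                          reverse.size(), reverse) == 0;
}
```

This mirrors llama.cpp's interactive flow: the template ends at the point where the user speaks, and detection of the same string hands control back to the user.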