<!--
# Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#  * Redistributions of source code must retain the above copyright
#    notice, this list of conditions and the following disclaimer.
#  * Redistributions in binary form must reproduce the above copyright
#    notice, this list of conditions and the following disclaimer in the
#    documentation and/or other materials provided with the distribution.
#  * Neither the name of NVIDIA CORPORATION nor the names of its
#    contributors may be used to endorse or promote products derived
#    from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-->

[License: BSD-3-Clause](https://opensource.org/licenses/BSD-3-Clause)
# Triton Redis Cache

This repo contains an example cache for caching data with Redis. Ask questions or report problems in the main Triton issues page.
## Build the Cache

If you don't have it installed already, install rapidjson-dev:

```bash
apt install rapidjson-dev
```

Use a recent cmake to build, and run the following:

```bash
$ mkdir build
$ cd build
$ cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
$ make install
```
The following required Triton repositories will be pulled and used in the build. By default, the "main" branch/tag will be used for each repo, but the following CMake arguments can be used to override:

* triton-inference-server/core: `-D TRITON_CORE_REPO_TAG=[tag]`
* triton-inference-server/common: `-D TRITON_COMMON_REPO_TAG=[tag]`
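For example, to pin both repos to a specific release branch (the `r24.01` tag below is an illustrative placeholder; substitute the tag matching your Triton release):

```bash
# Illustrative: r24.01 is a placeholder tag, not a requirement of this repo.
cmake -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install \
      -DTRITON_CORE_REPO_TAG=r24.01 \
      -DTRITON_COMMON_REPO_TAG=r24.01 ..
```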
## Deploy the Cache

To deploy the Redis cache to Triton, build the binary (see the build instructions above) and copy the `libtritoncache_redis.so` file into a folder named `redis` in the cache directory on the server you are running Triton from. By default this directory is `/opt/tritonserver/caches`, but it can be adjusted with the `--cache-dir` CLI option as needed.
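As a sketch of the copy step, assuming the default cache directory and the `install` prefix produced by the build above (adjust the source path to your actual install layout):

```bash
# Assumption: the build above placed the shared library under build/install;
# adjust if your install tree differs.
mkdir -p /opt/tritonserver/caches/redis
cp build/install/caches/redis/libtritoncache_redis.so /opt/tritonserver/caches/redis/
```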
Redis must also be running on a system reachable by Triton. There are many ways to deploy Redis; to learn how to get started, look at Redis's getting started guide.

## Configure the Cache

The Redis cache is configured through Triton's `--cache-config` CLI options.
The `--cache-config` option is variadic, meaning it can be repeated multiple times to set multiple configuration fields. The format of a `--cache-config` option is `<cache_name>,<key>=<value>`.

At a minimum, you must provide a host and port to allow the client to connect to Redis. For example, to connect to a Redis instance living on the host `redis-host` and listening on port `6379`:

```bash
tritonserver --cache-config redis,host=redis-host --cache-config redis,port=6379
```
| Configuration Option | Required | Description | Default |
| --- | --- | --- | --- |
| host | Yes | The hostname or IP address of the server where Redis is running. | N/A |
| port | Yes | The port number to connect to on the server. | N/A |
| user | No | The username to use for ACL authentication to the Redis server. | default |
| password | No | The password for Redis. | N/A |
| db | No | The db number to use. NOTE: use of the db number is considered an anti-pattern in Redis, so it is advised that you do not use this option. | 0 |
| connect_timeout | No | The maximum time, in milliseconds, to wait for a connection to Redis to be established. 0 means wait forever. | 0 |
| socket_timeout | No | The maximum time, in milliseconds, the client will wait for a response from Redis. 0 means wait forever. | 0 |
| pool_size | No | The number of pooled connections to Redis the client will maintain. | 1 |
| wait_timeout | No | The maximum time, in milliseconds, to wait for a connection from the pool. | 1000 |
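As a sketch combining several of these options (the host, port, and tuning values below are placeholders, not recommendations):

```bash
# Illustrative values only; tune pool_size and timeouts for your deployment.
tritonserver \
  --cache-config redis,host=redis-host \
  --cache-config redis,port=6379 \
  --cache-config redis,pool_size=4 \
  --cache-config redis,wait_timeout=2000
```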
Optionally, you may configure your user/password via environment variables: the corresponding user environment variable is `TRITONCACHE_REDIS_USERNAME`, and the corresponding password environment variable is `TRITONCACHE_REDIS_PASSWORD`.
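For example (the credentials shown are placeholders):

```bash
# Placeholders: substitute your actual Redis credentials.
export TRITONCACHE_REDIS_USERNAME=myuser
export TRITONCACHE_REDIS_PASSWORD=mypassword
tritonserver --cache-config redis,host=redis-host --cache-config redis,port=6379
```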
### TLS

Transport Layer Security (TLS) can be enabled in Redis and within the Triton Redis cache. To do so, you will need a TLS-enabled version of Redis, e.g. OSS Redis or Redis Enterprise. You will also need to configure Triton Server to use TLS with Redis through the following `--cache-config` TLS options.
| Configuration Option | Required | Description |
| --- | --- | --- |
| tls_enabled | Yes | Set to true to enable TLS. |
| cert | No | The certificate to use for TLS. |
| key | No | The certificate key to use for TLS. |
| cacert | No | The Certificate Authority certificate to use for TLS. |
| sni | No | Server name indication for TLS. |
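A sketch of a TLS-enabled invocation, assuming client certificate authentication (all file paths below are placeholders):

```bash
# Illustrative paths; point these at your actual certificate files.
tritonserver \
  --cache-config redis,host=redis-host \
  --cache-config redis,port=6379 \
  --cache-config redis,tls_enabled=true \
  --cache-config redis,cert=/path/to/client.crt \
  --cache-config redis,key=/path/to/client.key \
  --cache-config redis,cacert=/path/to/ca.crt
```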
## Monitoring

There are many ways to monitor what's going on in Redis. One popular approach is to export metrics data from Redis to Prometheus and use Grafana to observe them.

- If you're using OSS Redis, use the Redis Exporter to export metrics from Redis into Prometheus (see the sketch after this list).
- If you're using Redis Enterprise or Redis Cloud, you can use the built-in integrations for Prometheus.
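For the OSS Redis case, a minimal sketch using the third-party `oliver006/redis_exporter` image (not part of this repo; the Redis address below is a placeholder):

```bash
# Assumption: the commonly used oliver006/redis_exporter image; replace
# redis-host:6379 with the address of your Redis instance.
docker run -d -p 9121:9121 oliver006/redis_exporter \
  --redis.addr=redis://redis-host:6379
```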
## Try It Out

You can try out the Redis cache with Triton in Docker:

- Clone this repo:

  ```bash
  git clone https://github.com/triton-inference-server/redis_cache
  ```

- Follow the build instructions enumerated above.
- Clone the Triton server repo:

  ```bash
  git clone https://github.com/triton-inference-server/server
  ```

- Add the following to `docs/examples/model_repository/densenet_onnx/config.pbtxt`:

  ```
  response_cache {
    enable: true
  }
  ```

- cd into `redis_cache`.
- Install NVIDIA's container toolkit.
- Create an account on NGC.
- Log docker in to NVIDIA's container repository:

  ```
  docker login nvcr.io
  Username: $oauthtoken
  Password: <MY API KEY>
  ```

  NOTE: `Username: $oauthtoken` in this context means that your username is literally `$oauthtoken` - your API key serves as the unique part of your credentials.

- Run `docker-compose build`.
- Run `docker-compose up`.
- In a separate terminal, run:

  ```bash
  docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.06-py3-sdk
  ```

- Run:

  ```bash
  /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
  ```

  - On the first run, this will miss the cache.
  - Subsequent runs will pull the inference result out of the cache.
  - You can validate this by watching Redis with:

    ```bash
    docker exec -it redis_cache_triton-redis_1 redis-cli monitor
    ```