tupshin/cassandra-stress: Equivalent of stress.py, but more powerful

Overview

cassandra-stress is modeled after the stress.py script in Apache Cassandra's source distribution.

cassandra-stress is built on top of Hector, a well-tested and widely deployed Java client for Apache Cassandra. The benefits of using Hector for a tool like this are many:

  • JMX hooks into Hector for visibility into runtime statistics
  • Extensive configurability of which node(s) to target
  • Verbose logging output available
  • Hooks into performance counters to correlate client and server performance

cassandra-stress is currently just a command-line app that supports the following options (syntactically identical to stress.py where possible):

    usage: stress [options]... url1,[[url2],[url3],...]
     -b,--batch-size <arg>           The number of rows in the batch_mutate call
     -c,--columns <arg>              The number of columns to create per key
     -D,--discovery-delay <arg>      The amount of time to wait between runs
                                     of Auto host discovery. Providing a value
                                     enables this service
     -h,--help                       Print this help message and exit
     -L,--consistency-levels <arg>   Defaults to QUORUM for R+W, specified in
                                     the form of [read]:[write] eg. '-L
                                     ONE:ONE'
     -M,--max-wait <arg>             The maximum time to wait on acquiring a
                                     connection from the pool
                                     (maxWaitTimeWhenExhausted). Default is
                                     forever.
     -m,--unframed                   Disable use of TFramedTransport
     -n,--num-keys <arg>             The number of keys to create
     -o,--operation <arg>            One of insert, read, rangeslice, multiget
     -R,--retry-delay <arg>          The amount of time to wait between runs
                                     of Downed host retry delay execution. 30
                                     seconds by default.
     -S,--skip-retry-delay           Disable downed host retry service
                                     execution.
     -T,--thrift-timeout <arg>       The ThriftSocketTimeout value.
     -t,--threads <arg>              The number of client threads to create
     -w,--colwidth <arg>             The width of the column in bytes. Default
                                     is 16
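
The -L option above takes its value in the form [read]:[write] and falls back to QUORUM for both when omitted. The sketch below illustrates how such an argument could be interpreted. ConsistencyArg and its parse method are hypothetical names made up for this example; they are not part of cassandra-stress or Hector, and the tool's actual parsing may differ (for instance in how it treats case or malformed input).

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not taken from cassandra-stress or Hector.
class ConsistencyArg {
    final String read;
    final String write;

    ConsistencyArg(String read, String write) {
        this.read = read;
        this.write = write;
    }

    // Parses "[read]:[write]". A missing or blank argument yields QUORUM for both,
    // mirroring the documented default. Upper-casing the input is an assumption.
    static ConsistencyArg parse(String arg) {
        if (arg == null || arg.trim().isEmpty()) {
            return new ConsistencyArg("QUORUM", "QUORUM");
        }
        String[] parts = arg.trim().split(":");
        if (parts.length != 2 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected [read]:[write], got: " + arg);
        }
        return new ConsistencyArg(
                parts[0].toUpperCase(Locale.ROOT),
                parts[1].toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ConsistencyArg levels = parse(args.length > 0 ? args[0] : null);
        System.out.println("read=" + levels.read + " write=" + levels.write);
    }
}
```

Under these assumptions, passing ONE:QUORUM gives a read level of ONE and a write level of QUORUM, while a value without exactly one colon is rejected.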
    

    This project now runs as a stand-alone binary via the Maven Appassembler plugin. First clone the source of the project, then use Maven to execute the install goal:

    mvn install

    Move to the directory where the artifacts are assembled:

    cd target/appassembler

    Now you can execute the script directly as needed:

    sh bin/stress -o insert -b 50 -n 12000 localhost:9160

    This will perform the operation then drop you into a shell in which you can perform additional operations. The command syntax is similar to the initial call, except that the operation is the only argument:

    [cassandra-stress] 
    usage: operation [options]
    operation can be one of: insert, read, rangeslice, multiget, replay [N]
     -o,--operation <arg>    One of insert, read, rangeslice, multiget, verify_last_insert (1)
     -b,--batch-size <arg>   The number of rows in the batch_mutate call
     -t,--threads <arg>      The number of client threads we will create
     -c,--columns <arg>      The number of columns to create per key
     -C,--clients <arg>      The number of pooled clients to use
    

    (1) verify_last_insert must run with a single thread (-t 1) and QUORUM:QUORUM consistency. It is intended for testing against a cluster of at least 3 nodes with RF = 3.

    To run an operation again with the same parameters, just type the operation name. For example, following the initial example above, running the read operation with the same parameters would be:

    [cassandra-stress] read

    The main benefit of going into a shell is that it keeps the JVM and connection pool machinery warm, so additional test runs do not pay the overhead of spinning everything up. It also provides continuous access to Hector's JMX stats.
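
When choosing values for -n, -b, -c and -w, it can help to estimate the size of a run up front. The sketch below is a back-of-the-envelope calculation for the initial example (-b 50 -n 12000). It assumes every batch_mutate call carries exactly the batch size in rows, and that the payload is keys times columns per key times column width. StressSizing is a hypothetical name, the one-column-per-key figure is an assumption (the default for -c is not stated above), and the byte count ignores row keys, column names, timestamps and Thrift overhead.

```java
// Hypothetical sizing helper for illustration only; not part of cassandra-stress.
class StressSizing {

    // Number of batch_mutate calls, assuming each call carries batchSize rows
    // and the final call may be partially filled (ceiling division).
    static long batchCalls(long numKeys, long batchSize) {
        if (numKeys < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("numKeys must be >= 0 and batchSize > 0");
        }
        return (numKeys + batchSize - 1) / batchSize;
    }

    // Raw bytes of column values written: keys x columns per key x column width.
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    static long valueBytes(long numKeys, long columnsPerKey, long colWidthBytes) {
        return Math.multiplyExact(Math.multiplyExact(numKeys, columnsPerKey), colWidthBytes);
    }

    public static void main(String[] args) {
        // Values from the initial example: -n 12000 -b 50, with the default -w of 16.
        System.out.println("batch_mutate calls: " + batchCalls(12000, 50));
        // One column per key is an assumed value, not a documented default.
        System.out.println("value bytes: " + valueBytes(12000, 1, 16));
    }
}
```

With these inputs the estimate is 240 batch_mutate calls and 192000 bytes of column values. How the tool actually divides the calls among client threads is not covered by this sketch.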

    Todo (in rough priority)

    • Create ColumnFamily specifically for test
    • Add JMX innards for MBean driven command/control
    • Add replay N times functionality from shell
    • Add (much) better key distribution (gaussian, stdev argument)
    • SuperColumn support

    Misc.

    Offered under an MIT license (see LICENSE in the top level directory). As with any halfway decent load testing tool, you can generate a lot of load and put a system under duress or even cause it to fail. You are solely responsible for any issues the execution of this tool may cause. Use at your own risk.

    Cheers,
    zznate
