This repository was archived by the owner on Nov 3, 2020. It is now read-only.

Tags: sekwiatkowski/komputation

Tags

v0.12.5

v0.12.5:

- Switched CUDA C development to CLion
- Used the __JETBRAINS_IDE__ macro to declare CUDA's language extensions
- Header include paths are now relative to the given source file
- For runtime compilation with nvrtc, all include directives in the source code are replaced with a sequence of directives that use paths relative to the CUDA resource base directory (see the sketch after this list).
- Header files are now inferred from the source code and no longer have to be specified in kernel instructions.
- Fixed comparisons in the binary testing kernel
- Replaced double constants with floats
- Removed the unused numberEntries parameter from the kernel that replaces NaNs
- Removed an unused parameter from functions used for backpropagation kernels of recurrent layers
- Resolved a name conflict in the max-pooling kernel
- Simplified the definition of the stack of convolutional layers in the embedding toy demo with two filter widths
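
The include handling for nvrtc can be sketched as follows in Kotlin. This is a simplified illustration of the approach described above; the function name, the regular expression and the directory parameters are assumptions, not the actual komputation code.

    import java.io.File

    // Sketch: rewrite #include "..." directives so that nvrtc resolves every
    // header relative to a single CUDA resource base directory.
    fun rewriteIncludes(source: String, sourceDirectory: File, baseDirectory: File): String {
        val includePattern = Regex("#include\\s+\"([^\"]+)\"")
        return includePattern.replace(source) { match ->
            // Resolve the header relative to the including file, then express it
            // relative to the base directory.
            val header = File(sourceDirectory, match.groupValues[1]).canonicalFile
            val relativePath = header.relativeTo(baseDirectory.canonicalFile).path
            "#include \"$relativePath\""
        }
    }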

v0.12.3

v0.12.3:

- Finished implementing experimental support for (fixed-length, left-to-right, vanilla) GPU-accelerated recurrent neural networks
- Fixed the allocation of memory for the propagation result in CudaSquaredLoss
- Added a helper function to access and print arrays on the device
- Implemented a SumKernel to add up accumulated gradients for parameters that are used in each instance (see the sketch after this list)
- Added CUDA helper functions to cooperatively copy an array and add up two arrays
- Moved the entrywise CUDA activation functions to header files
- Removed unused array fill kernels
- Added a pointer to the maximum number of input columns in BaseCudaContinuation
- The shared parameter is passed directly to the CPU-specific ParameterizedSeries instruction. This makes it possible to use the same entries for the CPU and CUDA.
- Removed the CUDA IDs from the ResultExtraction enumeration
- Set the device activation function IDs to be constant
- Added a CUDA version of the increment demo
- Mentioned the demo in the README
- Replaced kotlin-stdlib-jre8 with kotlin-stdlib-jdk8
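
A CPU-side Kotlin sketch of the reduction that the sum kernel mentioned above performs: gradients accumulated per batch instance are added into one gradient for the shared parameter. The array layout and names are assumptions, not the actual kernel interface.

    // Sketch: add up the gradient accumulated for each instance in the batch.
    // The accumulation is laid out instance by instance, parameterSize entries each.
    fun sumAccumulatedGradients(accumulation: FloatArray, numberInstances: Int, parameterSize: Int): FloatArray {
        val sum = FloatArray(parameterSize)
        for (instance in 0 until numberInstances) {
            val offset = instance * parameterSize
            for (index in 0 until parameterSize) {
                sum[index] += accumulation[offset + index]
            }
        }
        return sum
    }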

v0.12.2

v0.12.2:

Removed the projection of a zero initial state vector in the first step in CpuRecurrent
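
The reasoning behind this change, in a simplified Kotlin sketch: at the first step the previous hidden state is a zero vector, so projecting it contributes nothing and the weighting can be skipped entirely. Names and signatures are illustrative, not the CpuRecurrent implementation.

    // Sketch of one recurrent step: at step 0 there is no previous state, so the
    // state projection is skipped instead of multiplying the weights with zeros.
    fun recurrentStep(step: Int, weightedInput: FloatArray, previousState: FloatArray,
                      weighState: (FloatArray) -> FloatArray,
                      activate: (FloatArray) -> FloatArray): FloatArray {
        val preActivation =
            if (step == 0) {
                weightedInput
            } else {
                val weightedState = weighState(previousState)
                FloatArray(weightedInput.size) { weightedInput[it] + weightedState[it] }
            }
        return activate(preActivation)
    }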

v0.12.1

v0.12.1:

- The summation of gradients based on the parameter index in CudaLookup is now deterministic.
- Removed the hash table kernel
- Replaced the use of the hash table with a pointer to the parameter indices
- Rewrote the group sum kernel based on information about the indices of the first occurrence of a parameter and its remaining occurrences (see the sketch after this list)
- Added a kernel to add up two arrays
- Fixed backward propagation in CudaStack by replacing the cuBLAS axpy operation with the use of the addition kernel
- The input memory can now store information about duplicate occurrences.
- Improved the name of the setters in InputMemory
- The optimizer kernels now check if the count is strictly positive.
- Moved reusable batch size and output entries members to BaseCudaEntryPoint
- Increased the batch size to 16 and changed hyperparameters in the TREC demos with two filter widths.
- Mentioned the CUDA TREC demo with two filters in the README
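
The deterministic group summation described above can be sketched on the CPU as follows: for each parameter, the gradient of its first occurrence is taken as the starting point and the gradients of its remaining occurrences are added in a fixed order, so the result no longer depends on the order in which threads run. All names are illustrative.

    // Sketch: deterministic per-parameter gradient summation based on the index
    // of the first occurrence and the indices of the remaining occurrences.
    fun groupSum(firstOccurrences: IntArray, remainingOccurrences: Array<IntArray>,
                 gradients: Array<FloatArray>, dimension: Int): Array<FloatArray> =
        Array(firstOccurrences.size) { parameter ->
            val sum = gradients[firstOccurrences[parameter]].copyOf(dimension)
            for (occurrence in remainingOccurrences[parameter]) {
                for (index in 0 until dimension) {
                    sum[index] += gradients[occurrence][index]
                }
            }
            sum
        }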

v0.12.0

v0.12.0:

- Simplified the specification of networks
- The input dimensions over the continuations of the network are computed automatically.
- Removed the Layer suffix from instruction factory functions
- Overloaded the instruction factory function to simplify the specification of initialization strategies
- Renamed Direction.Forward/Backward to Direction.LeftToRight/RightToLeft
- Shortened "ActivationFunction" to "Activation" and "ActivationLayer" to "Activation"
- Generalized BaseCudaEntrywiseActivationLayer to BaseCudaEntrywiseLayer
- The specification of the minimum length is required in the lookup instruction and optional in the input instruction.
- TREC categories are indexed based on all available training data.
- Renamed "forward" layer to "continuation" and shortened "combination layer" to "combination"
- Moved the architecture-specific interfaces from the general package to the respective architecture-specific packages
- Improved the names used in SparseAccumulator and SparseUpdate
- The series is passed on to the method of the ResultExtractionStrategy interface.
- Introduced CpuCombinationSeries to implement the addition of the weighted previous state and the weighted current input.
- Added the Cpu prefix to Series and ParameterizedSeries in preparation of the CUDA implementation of recurrent neural networks
- Optimized the performance of the RNN implementation by adding the bias to the input rather than adding it at each step (see the sketch after this list)
- Fixed the specification of the number of rows in CpuLogisticLoss
- Renamed the "Negation" demo to "Not"
- Stopped experimenting with dynamic parallelism
- CudaIdentity now implements CudaActivation.
- Introduced a base class for higher-order layers
- Differentiated the CUDA continuation base class into one class for layers that change the number of columns and one class for layers that don't.
- Reused the code for the computation of launch configurations in CudaHashing and CudaGroupSum
- Fixed the sparse update in CudaLookup
- Added a "copy" helper function that encapsulates System.arraycopy for copies
- Added a setter to InputMemory that caches all possible data
- Clarified references to the hash table in CUDA optimizers
- CUDA layers pass a pointer to the length of the input data and the maximum length within the batch.
- Unified the activation instruction factory functions over the two architectures
- Moved the concatenation layer to a separate package
- Added an instruction for weightings with shared parameters that is separate from the instruction for the weighting layer that uses a dedicated parameter
- The two weighting instructions inherit from the new BaseWeighting class.
- Added instructions for the three series types: Series, ParameterizedSeries and CombinationSeries
- Refactored the CPU RNN factory function based on the instructions
- Continuation instructions implement HasOutputDimensions and CanSetInputDimensions, while entry point instructions only implement HasOutputDimensions.
- Inlined some CUDA C helper functions
- Moved the division by 2 in the squared loss function from the host to the device
- Added the missing scaling of gradients in some of the optimization kernels
- Refactored the for loops used to update entries in optimization kernels
- Temporarily removed the CUDA forward layer tests
- Updated the links in the README
- Upgraded to Kotlin 1.2.10
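
The RNN bias optimization mentioned above can be illustrated with a small Kotlin sketch: since the bias is identical at every step, it is added once to each column of the weighted input before the recurrence starts, rather than inside the step loop. Layout and names are assumptions.

    // Sketch: add the bias to every column of the weighted input up front.
    // The weighted input is stored column-major with one column per step.
    fun addBiasToWeightedInput(weightedInput: FloatArray, bias: FloatArray, numberSteps: Int) {
        val dimension = bias.size
        for (step in 0 until numberSteps) {
            val offset = step * dimension
            for (index in 0 until dimension) {
                weightedInput[offset + index] += bias[index]
            }
        }
    }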

v0.11.3

v0.11.3:

- Added an instruction for bidirectional recurrent layers (see the sketch after this list)
- Rearranged the parameters in the factory functions of the recurrent layer and the dropout layer instruction
- Overloaded the dropout layer instruction factory function for the case of vectorial input
- Mentioned the bidirectional recurrent layer and the new running total demos in the README
- Updated the TREC sample code in the README
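
The idea behind a bidirectional recurrent layer, sketched in Kotlin: the same input is processed once left-to-right and once right-to-left, and the two per-step results are combined, here by concatenation. This illustrates the concept only and is not the komputation instruction.

    // Sketch: run a left-to-right and a right-to-left pass and concatenate the
    // per-step outputs of the two passes (FloatArray.plus concatenates).
    fun bidirectional(steps: List<FloatArray>,
                      leftToRight: (List<FloatArray>) -> List<FloatArray>,
                      rightToLeft: (List<FloatArray>) -> List<FloatArray>): List<FloatArray> {
        val forward = leftToRight(steps)
        val backward = rightToLeft(steps.reversed()).reversed()
        return forward.zip(backward) { left, right -> left + right }
    }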

v0.11.2

v0.11.2:

- The recurrent layer can now emit either all steps or the last step (see the sketch after this list).
- Added demos that compute the total of fixed-length and variable-length input
- Mentioned the new recurrent layer implementation in the README
- Included links to the demos in the README
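
A minimal sketch of the two emission modes in Kotlin: either the full output (one column per step) is returned, or only the final column. The enum and function are illustrative, not the library's ResultExtraction type.

    // Sketch: return either all steps of the output or only the last step.
    // The output is stored column-major with one column of size `dimension` per step.
    enum class Emission { AllSteps, LastStep }

    fun extractResult(output: FloatArray, dimension: Int, numberSteps: Int, emission: Emission): FloatArray =
        when (emission) {
            Emission.AllSteps -> output.copyOf(dimension * numberSteps)
            Emission.LastStep -> output.copyOfRange((numberSteps - 1) * dimension, numberSteps * dimension)
        }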

v0.11.1

v0.11.1:

- Implemented testing support for multi-class and binary classification problems (see the sketch after this list)
- Constructors of optimization instructions are now internal.
- Removed AttentiveDecoder and the reverse demo based on that decoder
- Removed its specific dependencies: column repetition, row summation and transposition
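
A simplified Kotlin sketch of how such tests can be evaluated: multi-class predictions are compared by argmax, binary predictions by thresholding at 0.5. The functions are illustrative and not the library's testing API.

    // Sketch: correctness checks for multi-class (argmax) and binary (0.5
    // threshold) classification.
    fun isCorrectMultiClass(predictions: FloatArray, targets: FloatArray): Boolean {
        val predicted = predictions.indices.maxByOrNull { predictions[it] }!!
        val expected = targets.indices.maxByOrNull { targets[it] }!!
        return predicted == expected
    }

    fun isCorrectBinary(prediction: Float, target: Float) =
        (if (prediction >= 0.5f) 1.0f else 0.0f) == target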

v0.11.0

v0.11.0:

- Implemented and tested Adam optimization for CUDA (see the sketch after this list)
- Set a delta in the equality assertions of CUDA optimization tests
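
For reference, the Adam update that such a kernel computes, sketched on the CPU in Kotlin. Hyperparameter defaults follow the original Adam paper; the function is an illustration, not the CUDA kernel.

    import kotlin.math.pow
    import kotlin.math.sqrt

    // Sketch of one Adam update. firstMoment and secondMoment hold the running
    // moment estimates; step is the 1-based count of updates performed so far.
    fun adamUpdate(parameter: FloatArray, gradient: FloatArray,
                   firstMoment: FloatArray, secondMoment: FloatArray, step: Int,
                   learningRate: Float = 0.001f, beta1: Float = 0.9f,
                   beta2: Float = 0.999f, epsilon: Float = 1e-8f) {
        for (index in parameter.indices) {
            firstMoment[index] = beta1 * firstMoment[index] + (1.0f - beta1) * gradient[index]
            secondMoment[index] = beta2 * secondMoment[index] + (1.0f - beta2) * gradient[index] * gradient[index]
            val correctedFirst = firstMoment[index] / (1.0f - beta1.pow(step))
            val correctedSecond = secondMoment[index] / (1.0f - beta2.pow(step))
            parameter[index] -= learningRate * correctedFirst / (sqrt(correctedSecond) + epsilon)
        }
    }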

v0.10.6

v0.10.6:

- Fixed compilation errors in the kernels for SGD and Momentum
- Implemented and tested Adadelta optimization for CUDA
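
For reference, the Adadelta update that such a kernel computes, sketched on the CPU in Kotlin. Running averages of squared gradients and squared updates replace a global learning rate; the function is an illustration, not the CUDA kernel.

    import kotlin.math.sqrt

    // Sketch of one Adadelta update with decay rho and smoothing epsilon.
    // gradientAccumulation and updateAccumulation hold the running averages of
    // squared gradients and squared updates.
    fun adadeltaUpdate(parameter: FloatArray, gradient: FloatArray,
                       gradientAccumulation: FloatArray, updateAccumulation: FloatArray,
                       rho: Float = 0.95f, epsilon: Float = 1e-6f) {
        for (index in parameter.indices) {
            gradientAccumulation[index] = rho * gradientAccumulation[index] + (1.0f - rho) * gradient[index] * gradient[index]
            val update = -sqrt(updateAccumulation[index] + epsilon) / sqrt(gradientAccumulation[index] + epsilon) * gradient[index]
            updateAccumulation[index] = rho * updateAccumulation[index] + (1.0f - rho) * update * update
            parameter[index] += update
        }
    }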