8000 Home · jrmwng/like2015 Wiki · GitHub
[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Skip to content
Jeremy Wong edited this page Dec 13, 2015 · 20 revisions
8000

Welcome to the like2015 wiki!

This repository contains some codes (interfaces and implementations) I like in the year 2015. They are relatively mature that satisfying my existing needs. Therefore, they should be simple, practical and efficient in some senses.


like::atomic_shared_ptr<T> and like::atomic_weak_ptr<T>

Prevailing implementations of atomic_shared_ptr<T> use spin-lock and mutex for exclusive access. This implementation (like::atomic_shared_ptr<T>) uses shared-lock for reader access (At least no LOCK-prefix instruction is used to acquire a lock and 1 LOCK-prefix instruction is used to increment reference count) and lock-synchronization for writer access (1 LOCK-prefix instruction is used to decrement reference count). With less locked time and fewer LOCK-prefix instructions (please check what is less and what are fewer). It could be better than most implementations today. (BTW, IA32 CPU and WB memory)

like::shared_ptr<T> and like::weak_ptr<T>

Existing implementations (BOOST and Microsoft) of shared_ptr<T> use 2 volatile variables (1 for shared_ptr and 1 for weak_ptr) for reference counting. This implementation (like::shared_ptr<T>) combines the 2 variables into 1, resulting in less LOCK-prefix instructions executed (theoretically and hopefully; please check what are less). The trade-off is to use 16-bit (for example) instead of 32-bit for each reference counting.


sudoku_t<sudoku_9x9_traits>

It is a Sudoku solver using pre-emptive set technique in single function call only. It solves most human playable Sudoku puzzles in about 0.85 million CPU cycles with single thread and Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz, though initial implementation spends about 10 million CPU cycles. Three optimizations were done with ~12x performance gain. The first optimization used SSE instructions to process 8 combinations in one iteration instead of 8 iterations. The second optimization eliminates some batches of 8-iteration and 64-iteration. The third optimization replaced some cache accesses with register accesses.

By the way, it can be customized for constraints in irregular shape (non-3x3 block).


like::xmm_ptr realizes the concept of utilizing XMM registers as an alternate of general registers, though only pointer is envisioned. Its interface is similar to that of std::tuple. For example,

typedef like::xmm_ptr<unsigned char const, char> ptr_type;
ptr_type xmmPtr(pInput, pOutput);

unsigned char const *pInputX  = get_ptr<0>(xmmPtr); // get first pointer
         char       *pOutputX = get_ptr<1>(xmmPtr); // get second pointer

xmmPtr += ptr_type(16, 16) * 2; // advance pointers

like::member

like::member provides a simple interface to access descendant member (field and result of method invocation) of a C++ object (or C++ structure), though array member is not envisioned.

struct coord_t
{
    int x;
    int y;
};
struct line_t
{
    coord_t start;
    coord_t end;
};
line_t astLine[10];

// ...

coord_t astPos[10];
int anStartX[10];

std::transform(std::cbegin(astLine), std::cend(astLine), astPos, like::member(&line_t::end));
std::transform(std::cbegin(astLine), std::cend(astLine), anStartX, like::member(&line_t::start, &coord_t::x));

like::order_by

like::order_by provides a simple interface to specifying ordering of elements.

std::sort(std::begin(astPos), std::end(astPos), like::order_by(&coord_t::y));
std::sort(std::begin(astPos), std::end(astPos), like::order_by(&coord_t::y, std::greater<int>()));
std::sort(std::begin(astLine), std::end(astLine), like::order_by(like::member(&line_t::start, &coord_t::x)));
0