-
Notifications
You must be signed in to change notification settings - Fork 0
Thesis #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
AutomataI can explain that the idea behind the book is based on IO Automata, from Mark Tuttle. It is important to mention which is the goal of this model, which is made for probing that an algorithm is correct, which is not our case. It will be nice to explain briefly how this model helps to probe correctness but I'm not sure if Tuttle's thesis explains this. |
EventuateIs nice to explain that Eventuate uses some of this abstractions, because is a real world tool |
Lack of vocabulary (cappIO motivation)This is the main reason behind cappIO, the idea is to have a common vocabulary with a well-defined meaning (*). It would be really nice to reference some issue discovered by Jepsen. It mainly covers consistency issues but is good to explain the long-term goal. There is one talks about causal consistency (*) Another example is consistency is an overused term in different fields (reference Klepman pag 224). Maybe I should talk about this in the thesis and not in the tool because I don't use it. Reference https://medium.com/databasss/on-ways-to-agree-part-1-links-and-flp-impossibility-f6bd8a6a0980 |
Real use cases for BroadcastBeyond the examples from the book, it would be nice to find some real program (db, queueing system) that uses broadcast. Check https://github.com/basho/riak_core/wiki/Riak-Core-Broadcast Which is the relation between broadcast and consistency?Maybe this relation is in total-order broadcast that combines consensus algorithms with broadcast. (I need to read more in-depth the chapter of the book) Read Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS and Kleppmann's book again. |
Introduction to reliable ... 1.6 Chapter notesThe idea of using multiple, replicated processes for tolerating faults of individual processes links together most algorithms presented in this book. This paradigm can be traced back to the work on the Software-Implemented Fault Tolerance (SIFT) project in 1978, which addressed the challenging problem of building a fault-tolerant computer for aircraft control (Wensley et al. 1978). |
DS mantra
|
Particularity of DS = partial failures! One process/link fails but the others continue working. From DDIA page 323
I could use this but is not as simple as is described here because in non distributed systems you can also have multithreaded operations and if you use a database this is accessed by multiple users so you have to deal with consistency semantics (but from the ACID) |
BroadcastDistributed Systems Concepts and Design
Atomic Broadcast |
FLP impossibility ruleI can use the FLP to describe the importance of describing the system model in which we are going to design the algorithm, FLP says the consensus is impossible to achieve if at least one node may fail but only in the async model. It would be nice to show this scenario using cappio (only the screenshot), I mean when a single node fails and consensus is not achieved. DDIA Pag 353
Fun and Profit
Consensus321, 364-375, 554 DDIA pag 322
Distributed transactions and consensus (DDIA page 352)
2PCDDIA page 355
Fault-tolerance consensusDDIA page 365
Consensus algorithms and total order broadcast (DDIA page 366)
Epoch numbering and quorums(I need to read more about this) DDIA page 368
Limitations of consensusDDIA page 369
Types of consensus (from IRSDP)Split brainDDIA page 367
Fun and profit
The properties described above represents uniform consensus, "which is equivalent to regular consensus in asynchronous systems with unreliable failure detectors" (DDIA page 365) Equivalent problems
Interesting here to say that even single leader databases (e.g. they use snapshot isolation) (1) or processors (they use several cache layers) don't provide linearizability due to performance reasons. (1) Check this better because transaction isolation is not the same as consistency. Questions
|
AbstractionsDDIA pag 321
Fun and profit
|
Distributed ConsistencyDDIA Page 323 about eventual consistency
LinearizabilityDefined by Herlihy and Wing [1990] DDIA page 324
Recency guaranteeTODO Linearizability Versus SerializabilityDDIA page 329
Fun and profit ???
TODO (read Burckhardt y Viotti) In what circumstances is linearizability useful? (DDIA page 330)
Implementing Linearizable Systems (DDIA page 332)
Linearizability and network delays
Linearizability is stronger than causal consistency
Relation between linearizability and consensusDDIA page 350
DDIA page 352
Implementing linearizable storage using total order broadcast
Implementing total order broadcast using linearizable storage
More stuff |
Distributed Consistency vs ACIDDDIA page 323
I would say that is not even required to talk about delays and faults if you have multi-writer (???) or leaderless replication (Dynamo, Cassandra) Burckhardt page 135
|
Why CaPPIO?
|
CAPDDIA page 337A Critique of the CAP TheoremCAP Twelve years later: How the "Rules" have ChangedFun and profitCheck the references! maybe is something else interesting Others |
CRDTsReal use case in production ??? |
System modelsDDIA chapter 8 DDIA page 300
|
Safety and livenessDDIA page 308 IRSDP |
ZookeeperDDIA page 370
Of these features, only the linearizable atomic operations really require consensus. Fencing tokensTODO |
IntroductionIRSDP ✅
Client-server ✅
TODO explain why this is not what we are interested in. Why design distributed systems? ✅
Fun and profit
Fault tolerance / Availability (already discussed above in IRSDP), if you only have one process and that process fails you have nothing. ✅
Some characteristics of distributed systems ✅
(DDIA page 300)
Partial failures (DDIA page 274) ✅
Things break (DDIA page 276) ✅
Reliable from unreliable components (DDIA page 277)
Network is unreliable (DDIA page 278) ✅
Network faults in practice (DDIA page 279) ✅
TODO See The Network is Reliable. An informal survey of real-world communications failures TODO there are other stuff to read in DDIA page 280
Unreliable clocks (DDIA page 287) ✅
Timestamp for ordering events (DDIA page 291)
|
: white_check_mark : |
GC pauses is interesting to add, maybe in the introduction when we are discussing about things that may wrong. See DDIA page 295, Process Pauses |
Transactions and ACIDDDIA page 321The best way of building fault-tolerant systems is to find some general-purpose abstractions with useful guarantees, implement them once, and then let applications rely on those guarantees. This is the same approach as we used with transactions in Chapter 7: by using a transaction, the application can pretend that there are no crashes (atomicity), that nobody else is concurrently accessing the database (isolation), and that storage devices are perfectly reliable (durability). Even though crashes, race conditions, and disk failures do occur, the transaction abstraction hides those problems so that the application doesn’t need to worry about them. DDIA page 221In the harsh reality of data systems, many things can go wrong: Safety guarantees provided by ACID (DDIA page 223)ACIDACID, which stands for Atomicity, Consistency, Isolation, and Dura‐ bility. It was coined in 1983 by Theo Härder and Andreas Reuter [7] in an effort to establish precise terminology for fault-tolerance mechanisms in databases. AtomicFor example, in multi-threaded programming, if one thread executes an atomic operation, that means there is no way that another thread could see the half-finished result of the operation. The system can only be in the state it was before the operation or after the operation, not something in between. ConsistencyThe word consistency is terribly overloaded:
DurabilityThe purpose of a database system is to provide a safe place where data can be stored without fear of losing it. Durability is the promise that once a transaction has com‐ mitted successfully, any data it has written will not be forgotten, even if there is a hardware fault or the database crashes. IsolationMost databases are accessed by several clients at the same time. That is no problem if they are reading and writing different parts of the database, but if
967A
they are accessing the same database records, you can run into concurrency problems (race conditions). The need for multi-object transactions (DDIA page 231)But do we need multi-object transactions at all? Would it be possible to implement any application with only a key-value data model and single-object operations? Weak isolation levels (DDIA page 233)TODO NoSQL and Distributed transactions (DDIA page 223)In the late 2000s, nonrelational (NoSQL) databases started gaining popularity. They aimed to improve upon the relational status quo by offering a choice of new data models (see Chapter 2), and by including replication (Chapter 5) and partitioning (Chapter 6) by default. Transactions were the main casualty of this movement: many of this new generation of databases abandoned transactions entirely, or redefined the word to describe a much weaker set of guarantees than had previously been under‐ stood [4]. SpannerBASE systems (DDIA page 223)(Systems that do not meet the ACID criteria are sometimes called BASE, which stands for Basically Available, Soft state, and Eventual consistency [9]. This is even more vague than the definition of ACID. It seems that the only sensible definition of BASE is “not ACID”; i.e., it can mean almost anything you want.) |
Where CappIO can help?
|
State of the art (kinda)
Topics
Optional
Partitioning(va a quedar muy descolgado del resto del trabajo)The text was updated successfully, but these errors were encountered: