Batches

Introduction

A batch is a group of read and write operations that are logically related. When an app uses Syncbase without synchronization, a batch is equivalent to an ACID transaction.

When an app uses Syncbase with synchronization, a batch no longer provides ACID semantics. Syncbase is a loosely coupled, decentralized, distributed storage system, so the guarantees a batch provides are limited to what such an environment can support.

While edge cases prevent us from claiming ACID semantics, we believe this behavior strikes a good balance between semantics that can be implemented and behavior that is useful to developers and users.

Batches are not limited to the data within a single collection. If a batch contains data from multiple collections, each peer receives only the parts of the batch it is allowed to see.
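The idea that a peer receives only the visible part of a multi-collection batch can be pictured with a small, self-contained sketch. It does not use the Syncbase API: `BatchWrite`, `BatchVisibilitySketch`, and `visibleTo` are hypothetical names, and modeling permissions as a plain set of readable collection names is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical record of a single write made inside a batch. */
class BatchWrite {
    final String collection;
    final String key;
    final String value;

    BatchWrite(String collection, String key, String value) {
        this.collection = collection;
        this.key = key;
        this.value = value;
    }
}

class BatchVisibilitySketch {
    /** Returns only the writes that belong to collections the peer may read. */
    static List<BatchWrite> visibleTo(Set<String> readableCollections, List<BatchWrite> batch) {
        List<BatchWrite> visible = new ArrayList<>();
        for (BatchWrite write : batch) {
            if (readableCollections.contains(write.collection)) {
                visible.add(write);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        // One batch that writes to two collections.
        List<BatchWrite> batch = Arrays.asList(
                new BatchWrite("c1", "myKey", "myValue"),
                new BatchWrite("c2", "myKey", "myValue"));

        // Assumption: this peer is allowed to see c1 only.
        Set<String> peerCanRead = new HashSet<>(Arrays.asList("c1"));

        for (BatchWrite write : visibleTo(peerCanRead, batch)) {
            System.out.println(write.collection + "/" + write.key);
        }
    }
}
```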

Using Batches

BatchDatabase is the entry point to the batch API. BatchDatabase is similar to Database, except that it provides commit and abort methods, and all operations on collection references obtained from a BatchDatabase are part of the batch.

RunInBatch

RunInBatch is the recommended way to perform batch operations. It detects concurrent batch errors and handles retries, commits, and aborts automatically.

db.runInBatch(new Database.BatchOperation() {
  @Override
  public void run(BatchDatabase batchDb) throws SyncbaseException {
    Collection c1 = batchDb.createCollection();
    Collection c2 = batchDb.createCollection();

    c1.put("myKey", "myValue");
    c2.put("myKey", "myValue");

    // No need to commit. RunInBatch will commit and retry if necessary.
  }
}, new Database.BatchOptions());
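To make the retry behavior more concrete, here is a minimal, self-contained sketch of what a RunInBatch-style helper might do internally. It does not use the Syncbase API: `ToyStore`, `ToyBatch`, `ConflictException`, and the `maxAttempts` parameter are hypothetical stand-ins, and detecting conflicts with a single version counter is an assumption chosen for illustration, not a description of how Syncbase is implemented.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical error: another batch committed after this one began. */
class ConflictException extends Exception {
    ConflictException(String message) { super(message); }
}

/** Toy in-memory store; a version counter stands in for real conflict detection. */
class ToyStore {
    /** Hypothetical callback type, loosely modeled on Database.BatchOperation. */
    interface BatchOperation {
        void run(ToyBatch batch) throws Exception;
    }

    private final Map<String, String> data = new HashMap<>();
    private long version = 0;

    synchronized long version() { return version; }
    synchronized String get(String key) { return data.get(key); }

    ToyBatch beginBatch() { return new ToyBatch(this); }

    /** Applies all writes together, or none if the store changed since the batch began. */
    synchronized void commit(long startVersion, Map<String, String> writes)
            throws ConflictException {
        if (startVersion != version) {
            throw new ConflictException("store changed since the batch began");
        }
        data.putAll(writes);
        version++;
    }

    /** Runs op in a fresh batch, retrying on conflict up to maxAttempts times. */
    void runInBatch(BatchOperation op, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            ToyBatch batch = beginBatch();
            try {
                op.run(batch);   // other exceptions propagate; buffered writes are dropped
                batch.commit();
                return;
            } catch (ConflictException e) {
                if (attempt >= maxAttempts) {
                    throw e;     // give up after the last allowed attempt
                }
                // Otherwise loop again with a brand-new batch.
            }
        }
    }
}

/** Buffers writes until commit, so an uncommitted batch leaves the store untouched. */
class ToyBatch {
    private final ToyStore store;
    private final long startVersion;
    private final Map<String, String> writes = new HashMap<>();

    ToyBatch(ToyStore store) {
        this.store = store;
        this.startVersion = store.version();
    }

    void put(String key, String value) { writes.put(key, value); }
    void commit() throws ConflictException { store.commit(startVersion, writes); }
}

class BatchRetrySketch {
    /** Forces one conflict, then returns how many times the operation ran. */
    static int runWithOneConflict(ToyStore store) {
        int[] attempts = {0};
        try {
            store.runInBatch(batch -> {
                attempts[0]++;
                if (attempts[0] == 1) {
                    // Simulate a concurrent writer committing while this batch is open.
                    ToyBatch other = store.beginBatch();
                    other.put("otherKey", "otherValue");
                    other.commit();
                }
                batch.put("myKey", "myValue");
            }, 3);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return attempts[0];
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        System.out.println("attempts: " + runWithOneConflict(store));
        System.out.println("myKey = " + store.get("myKey"));
    }
}
```

In this sketch the operation runs twice: the first attempt fails at commit because of the simulated concurrent write, and the second attempt succeeds. This is also why a batch operation should be safe to execute more than once.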

Warning

Collection references previously obtained from Database are not part of the batch, so operations on them inside RunInBatch are not atomic. New collection references must be obtained from BatchDatabase.

The following code snippet demonstrates the WRONG way of using batches.

// WRONG: c1 is NOT part of the batch.
final Collection c1 = db.createCollection();
db.runInBatch(new Database.BatchOperation() {
    @Override
    public void run(BatchDatabase batchDb) throws SyncbaseException {
        Collection c2 = batchDb.createCollection();
        // WRONG: Only mutations on c2 are atomic since c1 reference
        // was obtained from Database and not BatchDatabase.
        c1.put("myKey", "myValue");
        c2.put("myKey", "myValue");
        // No need to commit. RunInBatch will commit and retry if necessary.
    }
}, new Database.BatchOptions());

BeginBatch

BeginBatch is an alternative way to start a batch operation. Unlike RunInBatch, it does not manage retries, commits, or aborts; the developer is responsible for handling them.

BatchDatabase batchDb = db.beginBatch(new Database.BatchOptions());

Collection c1 = batchDb.createCollection();
Collection c2 = batchDb.createCollection();

c1.put("myKey", "myValue");
c2.put("myKey", "myValue");

batchDb.commit();
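Because BeginBatch leaves commit and abort to the caller, one reasonable pattern is to abort the batch whenever an operation fails before the commit. The sketch below illustrates only that control flow. `SketchBatch`, `RecordingBatch`, and `ManualBatchSketch` are hypothetical stand-ins that mirror the commit and abort methods described earlier; they are not the Syncbase API, and the real method signatures and exception types may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in exposing only the operations discussed in this section. */
interface SketchBatch {
    void put(String key, String value) throws Exception;
    void commit() throws Exception;
    void abort();
}

/** Records calls so the control flow can be observed without a real database. */
class RecordingBatch implements SketchBatch {
    final List<String> log = new ArrayList<>();
    private final boolean failOnPut;

    RecordingBatch(boolean failOnPut) { this.failOnPut = failOnPut; }

    public void put(String key, String value) throws Exception {
        if (failOnPut) {
            throw new Exception("simulated write failure");
        }
        log.add("put " + key);
    }

    public void commit() { log.add("commit"); }
    public void abort() { log.add("abort"); }
}

class ManualBatchSketch {
    /** Writes two entries and commits. Returns true only if the commit succeeded. */
    static boolean writeBoth(SketchBatch batch) {
        try {
            batch.put("key1", "myValue");
            batch.put("key2", "myValue");
        } catch (Exception e) {
            // A write failed before commit: discard the whole batch.
            batch.abort();
            return false;
        }
        try {
            batch.commit();
            return true;
        } catch (Exception e) {
            // Commit failed; the caller may decide to retry with a new batch.
            return false;
        }
    }

    public static void main(String[] args) {
        RecordingBatch ok = new RecordingBatch(false);
        System.out.println(writeBoth(ok) + " " + ok.log);

        RecordingBatch failing = new RecordingBatch(true);
        System.out.println(writeBoth(failing) + " " + failing.log);
    }
}
```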

Warning

Using a collection reference obtained from a BatchDatabase after the batch has been committed or aborted throws an exception.

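The following code snippet demonstrates two WRONG ways of using batches: writing through a collection reference obtained from Database, and using a batch collection reference after the batch has been committed.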

// WRONG: c1 is NOT part of the batch.
Collection c1 = db.createCollection();
BatchDatabase batchDb = db.beginBatch(new Database.BatchOptions());

// c2 is part of the batch.
Collection c2 = batchDb.createCollection();

// WRONG: Only mutations on c2 are atomic since c1 reference was obtained
// from Database and not BatchDatabase.
c1.put("myKey", "myValue");
c2.put("myKey", "myValue");

batchDb.commit();

// WRONG: Throws exception since c2 is from an already committed batch.
c2.put("myKey", "myValue");

Summary