Introduction to Batch Apex
Prior to the availability of Batch Apex, the only options for processing data exceeding the
governor limits of triggers and controllers were tricky workarounds to shift work off of
the platform. For example, you might have hundreds of thousands of records spanning
multiple Lookup relationships to be summarized, de-duplicated, cleansed, or otherwise
modified en masse algorithmically. You could use the Web Services API to interact with
the Force.com data from outside of Force.com itself, or JavaScript to process batches of
data inside the web browser. These approaches are usually slow and brittle, requiring
lots of code and exposing you to data quality problems over time due to gaps in error
handling and recovery. Batch Apex allows you to keep the large, data-intensive processing
tasks within the platform, taking advantage of its close proximity to the data and transactional
integrity to create secure, reliable processes without the limits of normal, interactive
Apex code. This section introduces you to concepts and guidelines for using Batch Apex
to prepare you for hands-on work in the following section.
Batch Apex Concepts
Batch Apex is an execution framework that splits a large dataset into subsets and provides
them to ordinary Apex programs that you develop, which continue to operate within
their usual governor limits. This means that, with some minor rework to make your code
operate as Batch Apex, you can process data volumes that would otherwise be prohibited
within the platform. By helping Salesforce break up your processing task, you are permitted
to run it within its platform.
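As a rough illustration of that rework, the sketch below shows the general shape of a Batch Apex class: ordinary Apex code placed behind the platform's Database.Batchable interface. The class name StateCleanupBatch and the cleanup rule (trimming and uppercasing BillingState on Account) are hypothetical, chosen only to keep the example short. Treat it as a minimal sketch rather than a recommended implementation.

```apex
// Hypothetical example class; the name and the cleanup rule are illustrative only.
global class StateCleanupBatch implements Database.Batchable<SObject> {

    // Called once when the job starts: describes the full dataset to process.
    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator(
            'SELECT Id, BillingState FROM Account WHERE BillingState != null');
    }

    // Called once per subset; each call runs within the usual governor limits.
    global void execute(Database.BatchableContext bc, List<SObject> records) {
        List<Account> changed = new List<Account>();
        for (SObject sobj : records) {
            Account acct = (Account) sobj;
            String normalized = acct.BillingState.trim().toUpperCase();
            // equals() is case-sensitive, unlike the == operator on Apex strings.
            if (!normalized.equals(acct.BillingState)) {
                acct.BillingState = normalized;
                changed.add(acct);
            }
        }
        update changed;
    }

    // Called once after the last subset has been processed.
    global void finish(Database.BatchableContext bc) {
        System.debug('StateCleanupBatch finished');
    }
}
```

The loop inside execute is the same code you might write in a trigger or controller. The framework's contribution is to call it repeatedly, each time with a subset small enough to stay within the limits.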
A few key concepts are used in Batch Apex:
>> Scope: The scope is the set of records that a Batch Apex process operates on. It can
consist of 1 record or up to 50 million records. Scope is usually expressed as a
SOQL statement, which is contained in a Query Locator, a system object that is
blessedly exempt from the normal governor limits on SOQL. If your scope is too
complex to be specified in a single SOQL statement, then writing Apex code to
generate the scope programmatically is also possible. Unfortunately, using Apex
code dramatically reduces the number of records that can be processed, because it is
subject to the standard governor limit on records returned by a SOQL statement.
>> Batch job: A batch job is a Batch Apex program that has been submitted for execution.
It is the runtime manifestation of your code, running asynchronously within
the Force.com platform. Because batch jobs run in the background and can take
many hours to complete their work, Salesforce provides a user interface for listing
batch jobs and their statuses and for canceling individual jobs. This job
information is also available as a standard object in the database. Although the batch
job is not the atomic unit of work within Batch Apex, it is the only platform-provided
level at which you have control over a batch process.
>> Transaction: Each batch job consists of transactions, which are the governor limit-friendly
units of work you’re familiar with from triggers and Visualforce controllers.
By default, a transaction is up to 200 records, but you can adjust this downward in
code. When a batch job starts, the scope is split into a series of transactions. Each
transaction is then processed by your Apex code and committed to the database
independently. Although the same block of your code is being called upon to
process potentially thousands of transactions, the transactions themselves are normally
stateless. None of the variables in your code are saved between invocations unless
you explicitly designate your Batch Apex code as stateful when it is developed.
Salesforce doesn’t provide information on whether your transactions are run in parallel
or serially, nor how they are ordered. Observationally, transactions seem to run
serially, in order based on scope.
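The three concepts above can be tied together in one small sketch. The class name ContactCountBatch, the submit helper, and the batch size of 50 are hypothetical choices made for illustration. The example assumes the job information mentioned above is read from the AsyncApexJob standard object, and it marks the class with Database.Stateful so that a counter survives between transactions.

```apex
// Hypothetical example class; names and the batch size of 50 are illustrative only.
global class ContactCountBatch
        implements Database.Batchable<SObject>, Database.Stateful {

    // Kept between transactions only because the class is marked Database.Stateful.
    private Integer totalProcessed = 0;

    // Scope: a SOQL statement wrapped in a Query Locator.
    global Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Contact');
    }

    // Transaction: receives one slice of the scope and is committed on its own.
    global void execute(Database.BatchableContext bc, List<SObject> records) {
        totalProcessed += records.size();
    }

    // Runs once at the end; reads the job's status from the AsyncApexJob object.
    global void finish(Database.BatchableContext bc) {
        AsyncApexJob job = [
            SELECT Status, JobItemsProcessed, TotalJobItems, NumberOfErrors
            FROM AsyncApexJob
            WHERE Id = :bc.getJobId()
        ];
        System.debug(totalProcessed + ' records in ' + job.TotalJobItems
            + ' transactions, ' + job.NumberOfErrors + ' failed; status: ' + job.Status);
    }

    // Batch job: submitting the class for execution returns the job's Id.
    global static Id submit() {
        // The second argument lowers the transaction size from the default.
        return Database.executeBatch(new ContactCountBatch(), 50);
    }
}
```

Calling ContactCountBatch.submit() from anonymous Apex or a controller queues the job and returns immediately. Without Database.Stateful, totalProcessed would start again from zero in each transaction, so the total reported in finish would be wrong.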