The main class to use when interacting with a Cassandra cluster. Typically, one instance of this class will be created for each separate Cassandra cluster that your application interacts with.
Example usage:
>>> from cassandra.cluster import Cluster
>>> cluster = Cluster(['192.168.1.1', '192.168.1.2'])
>>> session = cluster.connect()
>>> session.execute("CREATE KEYSPACE ...")
>>> ...
>>> cluster.shutdown()
Cluster and Session also provide context management functions which implicitly handle shutdown when leaving scope.
executor_threads defines the number of threads in a pool for handling asynchronous tasks such as establishing connection pools or refreshing metadata.
Any of the mutable Cluster attributes may be set as keyword arguments to the constructor.
contact_points = ['127.0.0.1']
The list of contact points to try connecting to for cluster discovery.
Defaults to the loopback interface.
Note: When using DCAwareRoundRobinPolicy with no explicit local_dc set (as is the default), the DC is chosen from an arbitrary host in contact_points. In this case, contact_points should contain only nodes from a single, local DC.
Note: In the next major version, if you specify contact points, you will also be required to also explicitly specify a load-balancing policy. This change will help prevent cases where users had hard-to-debug issues surrounding unintuitive default load-balancing policy behavior.
port = 9042
The server-side port to open connections to. Defaults to 9042.
cql_version = None
If a specific version of CQL should be used, this may be set to that string version. Otherwise, the highest CQL version supported by the server will be automatically used.
protocol_version = 4
The maximum version of the native protocol to use.
See ProtocolVersion for more information about versions.
If not set in the constructor, the driver will automatically downgrade the version based on negotiation with the server, but it is most efficient to set this to the maximum supported by your version of Cassandra. Setting this also prevents conflicting version negotiation if your cluster is upgraded.
compression = True
Controls compression for communications between the driver and Cassandra. If left as the default of True, either lz4 or snappy compression may be used, depending on what is supported by both the driver and Cassandra. If both are fully supported, lz4 will be preferred.
You may also set this to 'snappy' or 'lz4' to request that specific compression type.
Setting this to False disables compression.
auth_provider
When protocol_version is 2 or higher, this should be an instance of a subclass of AuthProvider, such as PlainTextAuthProvider.
When protocol_version is 1, this should be a function that accepts one argument, the IP address of a node, and returns a dict of credentials for that node.
When not using authentication, this should be left as None.
load_balancing_policy
An instance of policies.LoadBalancingPolicy or one of its subclasses.
Changed in version 2.6.0.
Defaults to TokenAwarePolicy(DCAwareRoundRobinPolicy) when using CPython (where the murmur3 extension is available), and DCAwareRoundRobinPolicy otherwise. The default local DC will be chosen from the contact points.
Please see DCAwareRoundRobinPolicy for a discussion of default behavior with respect to DC locality and remote nodes.
reconnection_policy = <cassandra.policies.ExponentialReconnectionPolicy object>
An instance of policies.ReconnectionPolicy. Defaults to an instance of ExponentialReconnectionPolicy with a base delay of one second and a max delay of ten minutes.
default_retry_policy = <cassandra.policies.RetryPolicy object>
A default policies.RetryPolicy instance to use for all Statement objects which do not have a retry_policy explicitly set.
conviction_policy_factory = <class 'cassandra.policies.SimpleConvictionPolicy'>
A factory function which creates instances of policies.ConvictionPolicy. Defaults to policies.SimpleConvictionPolicy.
address_translator = <cassandra.policies.IdentityTranslator object>
A policies.AddressTranslator instance to be used in translating server node addresses to driver connection addresses.
metrics_enabled = False
Whether or not metric collection is enabled. If enabled, metrics will be an instance of Metrics.
metrics = None
An instance of cassandra.metrics.Metrics if metrics_enabled is True, else None.
ssl_options = None
An optional dict which will be used as kwargs for ssl.wrap_socket() when new sockets are created. This should be used when client encryption is enabled in Cassandra.
By default, a ca_certs value should be supplied (the value should be a string pointing to the location of the CA certs file), and you probably want to specify ssl_version as ssl.PROTOCOL_TLSv1 to match Cassandra's default protocol.
Changed in version 3.3.0.
In addition to wrap_socket kwargs, clients may also specify 'check_hostname': True to verify the cert hostname as outlined in RFC 2818 and RFC 6125. Note that this requires the certificate to be transferred, so it should almost always be combined with the option 'cert_reqs': ssl.CERT_REQUIRED. Note also that this functionality was not built into the Python standard library until versions 2.7.9 and 3.2. To enable this mechanism in earlier versions, patch ssl.match_hostname with a custom or back-ported function.
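A sketch of a typical ssl_options dict (the certificate path is a placeholder):

```python
import ssl

# These keys are passed through as kwargs to ssl.wrap_socket();
# check_hostname is handled by the driver itself (3.3.0+), as noted above.
ssl_options = {
    'ca_certs': '/path/to/rootca.crt',   # CA bundle used to verify the server
    'ssl_version': ssl.PROTOCOL_TLSv1,   # match Cassandra's default protocol
    'cert_reqs': ssl.CERT_REQUIRED,      # required for hostname verification
    'check_hostname': True,              # verify cert hostname (RFC 2818/6125)
}
# cluster = Cluster(['127.0.0.1'], ssl_options=ssl_options)
```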
sockopts = None
An optional list of tuples which will be used as arguments to socket.setsockopt() for all created sockets.
Note: some drivers find setting TCP_NODELAY beneficial in the context of their execution model. It was not found generally beneficial for this driver. To try it with your own workload, set sockopts = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)].
max_schema_agreement_wait = 10
The maximum duration (in seconds) that the driver will wait for schema agreement across the cluster. Defaults to ten seconds. If set <= 0, the driver will bypass schema agreement waits altogether.
metadata = None
An instance of cassandra.metadata.Metadata.
connection_class = <class 'cassandra.io.libevreactor.LibevConnection'>
This determines what event loop system will be used for managing I/O with Cassandra. These are the current options:
cassandra.io.asyncorereactor.AsyncoreConnection
cassandra.io.libevreactor.LibevConnection
cassandra.io.eventletreactor.EventletConnection (requires monkey-patching - see doc for details)
cassandra.io.geventreactor.GeventConnection (requires monkey-patching - see doc for details)
cassandra.io.twistedreactor.TwistedConnection
cassandra.io.asyncioreactor.AsyncioConnection
By default, AsyncoreConnection will be used, which uses the asyncore module in the Python standard library.
If libev is installed, LibevConnection will be used instead.
If gevent or eventlet monkey-patching is detected, the corresponding connection class will be used automatically.
AsyncioConnection, which uses the asyncio module in the Python standard library, is also available, but is currently experimental. Note that it requires asyncio features that were only introduced in the 3.4 line in 3.4.6, and in the 3.5 line in 3.5.1.
control_connection_timeout = 2.0
A timeout, in seconds, for queries made by the control connection, such as querying the current schema and information about nodes in the cluster. If set to None, there will be no timeout for these queries.
idle_heartbeat_interval = 30
Interval, in seconds, on which to heartbeat idle connections. This helps keep connections open through network devices that expire idle connections. It also helps discover bad connections early in low-traffic scenarios. Setting to zero disables heartbeats.
idle_heartbeat_timeout = 30
Timeout, in seconds, for which the driver waits for a heartbeat response on an idle connection. Lowering this value can help discover bad connections earlier.
schema_event_refresh_window = 2
Window, in seconds, within which a schema component will be refreshed after receiving a schema_change event.
The driver delays a random amount of time in the range [0.0, window) before executing the refresh. This serves two purposes:
1.) Spread the refresh for deployments with large fanout from C* to client tier, preventing a ‘thundering herd’ problem with many clients refreshing simultaneously.
2.) Remove redundant refreshes. Redundant events arriving within the delay period are discarded, and only one refresh is executed.
Setting this to zero will execute refreshes immediately.
Setting this negative will disable schema refreshes in response to push events (refreshes will still occur in response to schema change responses to DDL statements executed by Sessions of this Cluster).
topology_event_refresh_window = 10
Window, in seconds, within which the node and token list will be refreshed after receiving a topology_change event.
Setting this to zero will execute refreshes immediately.
Setting this negative will disable node refreshes in response to push events.
See schema_event_refresh_window for a discussion of the rationale.
status_event_refresh_window = 2
Window, in seconds, within which the driver will start the reconnect after receiving a status_change event.
Setting this to zero will connect immediately.
This is primarily used to avoid 'thundering herd' in deployments with large fanout from cluster to clients. When nodes come up, clients attempt to reprepare prepared statements (depending on reprepare_on_up), and establish connection pools. This can cause a rush of connections and queries if not mitigated with this factor.
prepare_on_all_hosts = True
Specifies whether statements should be prepared on all hosts, or just one.
This can reasonably be disabled on long-running applications with numerous clients preparing statements on startup, where a randomized initial condition of the load balancing policy can be expected to distribute prepares from different clients across the cluster.
reprepare_on_up = True
Specifies whether all known prepared statements should be prepared on a node when it comes up.
This may be disabled to avoid overwhelming a node on its return, or when it is assumed that the node was only marked down due to a network issue. If statements are not reprepared, they are prepared on their first execution, causing an extra round trip for one or more client requests.
connect_timeout = 5
Timeout, in seconds, for creating new connections.
This timeout covers the entire connection negotiation, including TCP establishment, options passing, and authentication.
schema_metadata_enabled = True
Flag indicating whether internal schema metadata is updated.
When disabled, the driver does not populate Cluster.metadata.keyspaces on connect, or on schema change events. This can be used to speed up initial connection, and reduce load on the client and server during operation. Turning this off gives up token-aware request routing and programmatic inspection of the metadata model.
token_metadata_enabled = True
Flag indicating whether internal token metadata is updated.
When disabled, the driver does not query node token information on connect, or on topology change events. This can be used to speed up initial connection, and reduce load on the client and server during operation. It is most useful in large clusters using vnodes, where the token map can be expensive to compute. Turning this off gives up token-aware request routing and programmatic inspection of the token ring.
timestamp_generator = None
An object, shared between all sessions created by this cluster instance, that generates timestamps when client-side timestamp generation is enabled. By default, each Cluster uses a new MonotonicTimestampGenerator.
Applications can set this value for custom timestamp behavior. See the documentation for Session.timestamp_generator().
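The generator is simply a callable returning integer timestamps in microseconds. Below is a hypothetical plain-Python stand-in; the real MonotonicTimestampGenerator also handles thread safety and clock-drift warnings, which this sketch omits:

```python
import time

class SimpleMonotonicTimestampGenerator(object):
    """Hypothetical stand-in for cassandra.timestamps.MonotonicTimestampGenerator."""

    def __init__(self):
        self.last = 0

    def __call__(self):
        # Microseconds since the epoch, never repeating or going backwards
        # even if the system clock does.
        now = int(time.time() * 1e6)
        self.last = max(now, self.last + 1)
        return self.last

gen = SimpleMonotonicTimestampGenerator()
first, second = gen(), gen()
# cluster = Cluster(timestamp_generator=gen)
```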
connect(keyspace=None, wait_for_all_pools=False)
Creates and returns a new Session object. If keyspace is specified, that keyspace will be the default keyspace for operations on the Session.
shutdown()
Closes all sessions and connections associated with this Cluster. To ensure all connections are properly closed, you should always call shutdown() on a Cluster instance when you are done with it.
Once shut down, a Cluster should not be used for any purpose.
register_user_type(keyspace, user_type, klass)
Registers a class to use to represent a particular user-defined type. Query parameters for this user-defined type will be assumed to be instances of klass. Result sets for this user-defined type will be instances of klass. If no class is registered for a user-defined type, a namedtuple will be used for result sets, and non-prepared statements may not encode parameters for this type correctly.
keyspace is the name of the keyspace that the UDT is defined in.
user_type is the string name of the UDT to register the mapping for.
klass should be a class with attributes whose names match the fields of the user-defined type. The constructor must accept kwargs for each of the fields in the UDT.
This method should only be called after the type has been created within Cassandra.
Example:

cluster = Cluster(protocol_version=3)
session = cluster.connect()
session.set_keyspace('mykeyspace')
session.execute("CREATE TYPE address (street text, zipcode int)")
session.execute("CREATE TABLE users (id int PRIMARY KEY, location address)")

# create a class to map to the "address" UDT
class Address(object):

    def __init__(self, street, zipcode):
        self.street = street
        self.zipcode = zipcode

cluster.register_user_type('mykeyspace', 'address', Address)

# insert a row using an instance of Address
session.execute("INSERT INTO users (id, location) VALUES (%s, %s)",
                (0, Address("123 Main St.", 78723)))

# results will include Address instances
results = session.execute("SELECT * FROM users")
row = results[0]
print(row.id, row.location.street, row.location.zipcode)
register_listener(listener)
Adds a cassandra.policies.HostStateListener subclass instance to the list of listeners to be notified when a host is added, removed, marked up, or marked down.
unregister_listener(listener)
Removes a registered listener.
add_execution_profile(name, profile, pool_wait_timeout=5)
Adds an ExecutionProfile to the cluster. This makes it available for use by name in Session.execute() and Session.execute_async(). This method will raise if the profile already exists.
Normally profiles will be injected at cluster initialization via Cluster(execution_profiles). This method provides a way of adding them dynamically.
Adding a new profile updates the connection pools according to the specified load_balancing_policy. By default, this method will wait up to five seconds for the pool creation to complete, so the profile can be used immediately upon return. This behavior can be controlled using pool_wait_timeout (see concurrent.futures.wait for timeout semantics).
set_max_requests_per_connection(host_distance, max_requests)
Sets a threshold for concurrent requests per connection, above which new connections will be created to a host (up to max connections; see set_max_connections_per_host()).
Pertains to connection pool management in protocol versions {1,2}.
get_max_requests_per_connection(host_distance)
set_min_requests_per_connection(host_distance, min_requests)
Sets a threshold for concurrent requests per connection, below which connections will be considered for disposal (down to core connections; see set_core_connections_per_host()).
Pertains to connection pool management in protocol versions {1,2}.
get_min_requests_per_connection(host_distance)
get_core_connections_per_host(host_distance)
Gets the minimum number of connections per Session that will be opened for each host with HostDistance equal to host_distance. The default is 2 for LOCAL and 1 for REMOTE.
This property is ignored if protocol_version is 3 or higher.
set_core_connections_per_host(host_distance, core_connections)
Sets the minimum number of connections per Session that will be opened for each host with HostDistance equal to host_distance. The default is 2 for LOCAL and 1 for REMOTE.
Protocol versions 1 and 2 are limited in the number of concurrent requests they can send per connection. The driver implements connection pooling to support higher levels of concurrency.
If protocol_version is set to 3 or higher, this is not supported (there is always one connection per host, unless the host is remote and connect_to_remote_hosts is False) and using this will result in an UnsupportedOperation.
get_max_connections_per_host(host_distance)
Gets the maximum number of connections per Session that will be opened for each host with HostDistance equal to host_distance. The default is 8 for LOCAL and 2 for REMOTE.
This property is ignored if protocol_version is 3 or higher.
set_max_connections_per_host(host_distance, max_connections)
Sets the maximum number of connections per Session that will be opened for each host with HostDistance equal to host_distance. The default is 8 for LOCAL and 2 for REMOTE.
If protocol_version is set to 3 or higher, this is not supported (there is always one connection per host, unless the host is remote and connect_to_remote_hosts is False) and using this will result in an UnsupportedOperation.
get_control_connection_host()
Returns the control connection host metadata.
refresh_schema_metadata(max_schema_agreement_wait=None)
Synchronously refresh all schema metadata.
By default, the timeout for this operation is governed by max_schema_agreement_wait and control_connection_timeout.
Passing max_schema_agreement_wait here overrides max_schema_agreement_wait.
Setting max_schema_agreement_wait <= 0 will bypass schema agreement and refresh schema immediately.
An Exception is raised if the schema refresh fails for any reason.
refresh_keyspace_metadata(keyspace, max_schema_agreement_wait=None)
Synchronously refresh keyspace metadata. This applies to keyspace-level information such as replication and durability settings. It does not refresh tables, types, etc. contained in the keyspace.
See refresh_schema_metadata() for a description of max_schema_agreement_wait behavior.
refresh_table_metadata(keyspace, table, max_schema_agreement_wait=None)
Synchronously refresh table metadata. This applies to a table, and any triggers or indexes attached to the table.
See refresh_schema_metadata() for a description of max_schema_agreement_wait behavior.
refresh_user_type_metadata(keyspace, user_type, max_schema_agreement_wait=None)
Synchronously refresh user defined type metadata.
See refresh_schema_metadata() for a description of max_schema_agreement_wait behavior.
refresh_user_function_metadata(keyspace, function, max_schema_agreement_wait=None)
Synchronously refresh user defined function metadata.
function is a cassandra.UserFunctionDescriptor.
See refresh_schema_metadata() for a description of max_schema_agreement_wait behavior.
refresh_user_aggregate_metadata(keyspace, aggregate, max_schema_agreement_wait=None)
Synchronously refresh user defined aggregate metadata.
aggregate is a cassandra.UserAggregateDescriptor.
See refresh_schema_metadata() for a description of max_schema_agreement_wait behavior.
refresh_nodes(force_token_rebuild=False)
Synchronously refresh the node list and token metadata.
force_token_rebuild can be used to rebuild the token map metadata, even if no new nodes are discovered.
An Exception is raised if the node refresh fails for any reason.
set_meta_refresh_enabled(enabled)
Deprecated: set schema_metadata_enabled and token_metadata_enabled instead.
Sets a flag to enable (True) or disable (False) all metadata refresh queries. This applies to both schema and node topology.
Disabling this is useful to minimize refreshes during multiple changes.
Meta refresh must be enabled for the driver to become aware of any cluster topology changes or schema updates.