Metadata Structure
For a comprehensive grasp of the metadata structure utilized in Apache ShardingSphere, a closer examination of the cluster mode of ShardingSphere-Proxy can be beneficial. The metadata structure in ZooKeeper adopts a three-layer hierarchy, with the first layer being the governance_ds. This layer encompasses critical components such as metadata information, built-in metadata database, and simulated MySQL database.
governance_ds
–metadata (metadata information)
—-sharding_db (logical database name)
——active_version (currently active version)
——versions
——–0
———-data_sources (underlying database information)
———-rules (rules of logical database, such as sharding, encryption, etc.)
——schemas (table and view information)
——–sharding_db
———-tables
————t_order
————t_single
———-views
—-shardingsphere (built-in metadata database)
——schemas
——–shardingsphere
———-tables
————sharding_table_statics (sharding statistics table)
————cluster_information (version information)
—-performance_schema (simulated MySQL database)
——schemas
——–performance_schema
———-tables
————accounts
—-information_schema (simulated MySQL database)
——schemas
——–information_schema
———-tables
————tables
————schemata
————columns
————engines
————routines
————parameters
————views
—-mysql
—-sys
–sys_data (specific row information of built-in metadata database)
—-shardingsphere
——schemas
——–shardingsphere
———-tables
————sharding_table_statistics
————–79ff60bc40ab09395bed54cfecd08f94
————–e832393209c9a4e7e117664c5ff8fc61
————cluster_information
————–d387c4f7de791e34d206f7dd59e24c1cThe metadata directory serves as a repository for storing essential rules and data source information, including the currently active metadata version, which is stored under the active_version node. Meanwhile, the versions stored within the metadata directory house different iterations of rules and database connection details.
Data Collection
The ShardingSphere’s integrated metadata database relies on data collection to aggregate information into memory and synchronizes it with the governance center to ensure consistency across clusters. To illustrate the process of data collection into memory, let’s use the sharding_table_statistics table as an example. The ShardingSphereDataCollectorinterface outlines a method for data collection:
Optional<ShardingSphereTableData> collect(String databaseName, ShardingSphereTable table, Map<String, ShardingSphereDatabase> shardingSphereDatabases) throws SQLException;
}
The aforementioned method is invoked by the ShardingSphereDataCollectorRunnable scheduled task. The current implementation initiates a scheduled task on the Proxy for data collection, utilizing the built-in metadata table to differentiate data collectors for specific data collection tasks. It is worth noting that based on feedback from the community, this approach may evolve into an e-job trigger method for collection in the future. The logic for collecting information is encapsulated in the ShardingStatisticsTableCollectorclass. This class employs the underlying data source and sharding rules to query relevant database information and extract statistical data.
Query Implementation
Upon completion of the data collection process, the ShardingSphereDataScheduleCollector class compares the collected information and the data stored in memory. In the event of any inconsistencies, it triggers an event EVENTBUSto notify the governance center. Subsequently, upon receiving the event, the governance center updates the information of other nodes and executes memory synchronization accordingly. The code for the event listening class is depicted below:
private final ShardingSphereDataPersistService persistService;
private final GlobalLockPersistService lockPersistService;
public ShardingSphereSchemaDataRegistrySubscriber(final ClusterPersistRepository repository, final GlobalLockPersistService globalLockPersistService, final EventBusContext eventBusContext) {
persistService = new ShardingSphereDataPersistService(repository);
lockPersistService = globalLockPersistService;
eventBusContext.register(this);
}
@Subscribe
public void update(final ShardingSphereSchemaDataAlteredEvent event) {
String databaseName = event.getDatabaseName();
String schemaName = event.getSchemaName();
GlobalLockDefinition lockDefinition = new GlobalLockDefinition(“sys_data_” + event.getDatabaseName() + event.getSchemaName() + event.getTableName());
if (lockPersistService.tryLock(lockDefinition, 10_000)) {
try {
persistService.getTableRowDataPersistService().persist(databaseName, schemaName, event.getTableName(), event…