
Handling Missing Disk Stores

This section applies to disk stores that hold the latest copy of your data for at least one region.

Show Missing Disk Stores

In gfsh, the show missing-disk-stores command lists every disk store that holds the most recent data and that other members are waiting on.

For replicated regions, this command only lists missing members that are preventing other members from starting up. For partitioned regions, this command also lists any offline data stores, even when other data stores for the region are online, because their offline status may be causing PartitionOfflineExceptions in cache operations or preventing the system from satisfying redundancy.

Example:

gfsh>show missing-disk-stores
          Disk Store ID              |   Host    |               Directory
------------------------------------ | --------- | -------------------------------------
60399215-532b-406f-b81f-9b5bd8d1b55a | excalibur | /usr/local/gemfire/deploy/disk_store1

Note: gfsh must be connected to a JMX Manager to run this command.

Note: The directories listed for missing disk stores may not be the directories you have currently configured for the member. The list is retrieved from the other running members, that is, the ones reporting the missing member. They hold information from the last time the missing disk store was online. If you have since moved your files and changed the member’s configuration, the listed directory locations will be stale.
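When the output needs to be processed by a script, the table can be split on its column separators. The following is a minimal Python sketch, assuming the three-column layout shown in the example above; the names MissingDiskStore and parse_missing_disk_stores are hypothetical and not part of gfsh or Geode.

```python
import re
from typing import List, NamedTuple

# Matches the hyphenated UUID form used for disk store IDs in the example output.
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


class MissingDiskStore(NamedTuple):
    disk_store_id: str
    host: str
    directory: str  # as last reported by the running members; may be stale


def parse_missing_disk_stores(output: str) -> List[MissingDiskStore]:
    """Extract the data rows of the table; header and separator rows are skipped."""
    rows = []
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Assumption: a data row has exactly three cells and starts with a UUID.
        if len(cells) == 3 and UUID_RE.match(cells[0]):
            rows.append(MissingDiskStore(*cells))
    return rows


sample = """\
          Disk Store ID              |   Host    |               Directory
------------------------------------ | --------- | -------------------------------------
60399215-532b-406f-b81f-9b5bd8d1b55a | excalibur | /usr/local/gemfire/deploy/disk_store1
"""

stores = parse_missing_disk_stores(sample)
print(stores[0].host)  # excalibur
```

Matching on the UUID in the first column, rather than on line position, keeps the sketch from depending on how many header or prompt lines precede the data rows.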

Disk stores usually go missing because their member fails to start. The member can fail to start for a number of reasons, including:

  • Disk store file corruption. You can check for this by validating the disk store.
  • Incorrect cluster configuration for the member.
  • Network partitioning.
  • Drive failure.
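To check for file corruption, gfsh provides the validate offline-disk-store command, which takes the disk store name and its directories. The following minimal Python sketch only assembles that command line; the helper name validate_command and the disk store name myDiskStore are hypothetical, and the directory is taken from the example output above, which may not match your current configuration.

```python
from typing import Sequence


def validate_command(name: str, disk_dirs: Sequence[str]) -> str:
    """Build the gfsh command that validates an offline disk store."""
    if not name:
        raise ValueError("a disk store name is required")
    if not disk_dirs:
        raise ValueError("at least one disk store directory is required")
    # Multiple directories are passed as a comma-separated list.
    return (
        "validate offline-disk-store"
        f" --name={name}"
        f" --disk-dirs={','.join(disk_dirs)}"
    )


# "myDiskStore" is a placeholder; use the name from your own configuration.
print(validate_command("myDiskStore", ["/usr/local/gemfire/deploy/disk_store1"]))
```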

Revoke Missing Disk Stores

This section applies to disk stores for which both of the following are true:

  • The disk store has the most recent copy of data for one or more regions or region buckets.
  • The disk store is unrecoverable, for example because you have deleted it, its files are corrupted, or it resides on a disk that has had a catastrophic failure.

When you cannot bring the latest persisted copy online, use the revoke command to tell the other members to stop waiting for it. Once the store is revoked, the system finds the remaining most recent copy of data and uses that.

Note: Once revoked, a disk store cannot be reintroduced into the system.

Use the gfsh show missing-disk-stores command to identify the disk store you need to revoke. The revoke command takes as input the disk store ID listed by that command.

Example:

gfsh>revoke missing-disk-store --id=60399215-532b-406f-b81f-9b5bd8d1b55a
Missing disk store successfully revoked
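Because a revoked disk store cannot be reintroduced, it can help to check the ID before the command is issued. The following minimal Python sketch validates the ID as a UUID and assembles the revoke command. It assumes that gfsh is on the PATH, that its -e option is used to pass commands, and that a plain connect reaches your JMX Manager. The helper names revoke_command and gfsh_argv are hypothetical.

```python
import uuid
from typing import List


def revoke_command(disk_store_id: str) -> str:
    """Return the gfsh revoke command for one disk store ID."""
    # uuid.UUID raises ValueError on malformed input; str() yields the
    # lowercase hyphenated form shown by show missing-disk-stores.
    canonical = str(uuid.UUID(disk_store_id.strip()))
    return f"revoke missing-disk-store --id={canonical}"


def gfsh_argv(command: str) -> List[str]:
    """Wrap a gfsh command for non-interactive use (assumes default connect)."""
    return ["gfsh", "-e", "connect", "-e", command]


cmd = revoke_command("60399215-532b-406f-b81f-9b5bd8d1b55a")
print(cmd)
print(gfsh_argv(cmd))
```

The sketch deliberately stops at building the argument list rather than running it, so that an operator can review the disk store ID before taking an irreversible step.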