Domain Index Operators CaveatsΒΆ

The domain index always does a full table scan and returns all hits. When using compound query that contains a common substructure, the full table scan will be inefficient. Consider the following:

select count(id)
  from nci_open
 where c$arnachm1.substructure(smiles, 'c1ccccc1', -1) = 1
   and id between 170000 and 171000;

The domain index search returns almost 200000 hits and these are filtered by the id clause to give about 600 hits. It would be more efficient to get approximately 1000 hits from the id clause first and then perform substructure search on those. Unfortunately, the Oracle cartridge implementation allows only a full table scan or a single row test. Each single row test would require a call to the RMI server and is impractical for 1000 rows.

To obtain better performance, use SQL that returns ROWIDS to the domain index using tableIndexSubstructSqlFilter:

select count(id)
   from nci_open m
  where rowid in
       (select * from
              ('ARNACHM1_TEST', 'NCI_OPEN', 'SMILES',
               'select rowid from arnachm1_test.nci_open where id between 170000 and 171000',
                      'c1ccccc1', -1)));

There is an analogous function tableIndexSimilaritySqlFilter for similarity searches.

Previous topic

Using Domain Index Operators

Next topic

Functional Operators