+How to make an object type searchable?
+by Neophytos Demetriou (k2pts\@cytanet.com.cy)
+
+Making an object type searchable involves three steps:
+
+- Choose the object type
- Implement FtsContentProvider
- Add triggers
+
Choose the object type
+In most of the cases, choosing the object type is straightforward.
+However, if your object type uses the content repository then you
+should make sure that your object type is a subclass of the
+"content_revision" class. You should also make sure all content is
+created using that subclass, rather than simply create content with
+the "content_revision" type.
+
+- Object types that don't use the CR, can be specified using
+
acs_object_type__create_type
, but those that use the
+CR need to use content_type__create_type
.
+content_type__create_type
overloads
+acs_object_type__create_type
and provides two views
+for inserting and viewing content data, and the CR depends on these
+views. - Whenever you call content_item__new, call it with
+'content_revision' as the item_subtype and 'your_content_type' as
+the content_type.
+
Implement FtsContentProvider
+FtsContentProvider is comprised of two abstract operations, namely
+datasource
and url
. The specification for
+these operations can be found in
+packages/search/sql/postgresql/search-sc-create.sql
.
+You have to implement these operations for your object type by
+writing concrete functions that follow the specification. For
+example, the implementation of datasource
for the
+object type note
, looks like this:
+ad_proc notes__datasource {
+ object_id
+} {
+ \@author Neophytos Demetriou
+} {
+ db_0or1row notes_datasource {
+ select n.note_id as object_id,
+ n.title as title,
+ n.body as content,
+ 'text/plain' as mime,
+ '' as keywords,
+ 'text' as storage_type
+ from notes n
+ where note_id = :object_id
+ } -column_array datasource
+
+ return [array get datasource]
+}
+
+When you are done with the implementation of
+FtsContentProvider
operations, you should let the
+system know of your implementation. This is accomplished by an SQL
+file which associates the implementation with a contract name. The
+implementation of FtsContentProvider
for the object
+type note
looks like:
+select acs_sc_impl__new(
+ 'FtsContentProvider', -- impl_contract_name
+ 'note', -- impl_name
+ 'notes' -- impl_owner_name
+);
+
+You should adapt this association to reflect your implementation.
+That is, change impl_name
with your object type and
+the impl_owner_name
to the package key. Next, you have
+to create associations between the operations of
+FtsContentProvider
and your concrete functions. Here's
+how an association between an operation and a concrete function
+looks like:
+select acs_sc_impl_alias__new(
+ 'FtsContentProvider', -- impl_contract_name
+ 'note', -- impl_name
+ 'datasource', -- impl_operation_name
+ 'notes__datasource', -- impl_alias
+ 'TCL' -- impl_pl
+);
+
+Again, you have to make some changes. Change the
+impl_name
from note
to your object type
+and the impl_alias
from notes__datasource
+to the name that you gave to the function that implements the
+operation datasource
.
+Add triggers
+If your object type uses the content repository to store its items,
+then you are done. If not, an extra step is required to inform the
+search_observer_queue of new content items, updates or deletions.
+We do this by adding triggers on the table that stores the content
+items of your object type. Here's how that part looks like for
+note
.
+create function notes__itrg ()
+returns opaque as $$
+begin
+ perform search_observer__enqueue(new.note_id,'INSERT');
+ return new;
+end;
+$$ language plpgsql;
+
+create function notes__dtrg ()
+returns opaque as $$
+begin
+ perform search_observer__enqueue(old.note_id,'DELETE');
+ return old;
+end;
+$$ language plpgsql;
+
+create function notes__utrg ()
+returns opaque as $$
+begin
+ perform search_observer__enqueue(old.note_id,'UPDATE');
+ return old;
+end;
+$$ language plpgsql;
+
+
+create trigger notes__itrg after insert on notes
+for each row execute procedure notes__itrg ();
+
+create trigger notes__dtrg after delete on notes
+for each row execute procedure notes__dtrg ();
+
+create trigger notes__utrg after update on notes
+for each row execute procedure notes__utrg ();
+
Questions & Answers
+- Q: If content is some binary file (like a pdf file stored in
+file storage, for example), will the content still be
+indexable/searchable?
+A: For each mime type we require some type of handler. Once the
+handler is available, i.e. pdf2txt, it is very easy to incorporate
+support for that mime type into the search package. Content items
+with unsupported mime types will be ignored by the indexer.
+ - Q: Can the search package handle lobs and files?
+A: Yes, the search package will convert everything into text based
+on the content and storage_type attributes. Here is the convention
+to use while writing the implementation of datasource:
+- Content is a filename when storage_type='file'.
- Content is a lob id when storage_type='lob'.
- Content is text when storage_type='text'.
+
+
+
+