The SQL Dude!

On-Disk & In-Memory Pages that teach you the SQL way of life!

Getting a custom IFilter working with SQL Server 2008/R2 (IFilterSample)

Posted by Sudarshan Narasimhan on July 14, 2011


For those of you not familiar with my past work, I am a moderator of the official SQL Server Full-Text Search blog on MSDN and I recently posted there an article about custom IFilter and SQL Server. I am attaching a sample of the project I downloaded and built from Windows SDK v6.1 which works with SQL Server 2008 and R2.

Background

In this post I will cover how to get a custom IFilter working on SQL Server 2008/R2 (64-bit or 32-bit). For the purpose of this demo, I am using the IFilterSample that is available with Windows Server 2008 SDK.  Once you have installed the SDK you can find the sample here à C:\Program Files\Microsoft SDKs\Windows\v6.1\Samples\winui\wsearch\IFilterSample

The IFilterSample implements a sample IFilter for a fictitious file format called “.filtersample” which is actually an XML file. This sample will work on Windows Vista and Windows 7.

 

Developers out there might already be familiar with using the IFilter interface to develop the filters. For those who aren’t you can start from here,


Windows Search Interfaces

http://msdn.microsoft.com/en-us/library/ee872063(v=VS.85).aspx


Data Add-in Interfaces

http://msdn.microsoft.com/en-us/library/ee872090(v=VS.85).aspx


IFilter Interface

http://msdn.microsoft.com/en-us/library/bb266451(v=vs.85).aspx

 


Building and Setting up the IFilter DLL

I am going to use IFilterSample as reference and must warn you this post is NOT a how-to-write-custom-IFilter lessonJ.


The IFilterSample is a native C++ COM based DLL and as part of its registration provides a CLSID that can be registered on the system. Here are the step-by-step instructions you came here for:-

 

1.       Download and install the Windows Server 2008 SDK on your machine.

2.       Copy the sample from C:\Program Files\Microsoft SDKs\Windows\v6.1\Samples\winui\wsearch\FilterSample, onto your desktop which is the copy we will use to build the project.

3.       Open the FilterSample.sln file and  in Additional Dependencies in Linker Properties, add xmllite.lib

clip_image002[4]

4.       Build the project for x64 or Win32 as appropriate (this would be dependent on the architecture of your SQL Server this IFilter will be used on).

5.       Once the project is built you will have FilterSample.DLL in your Debug or Release folder. Make sure that the SQL Server service account has FULL CONTROL permissions on this folder.

6.      From an elevated command prompt browse to the location of the DLL and register it using regsvr32 FilterSample.dll

clip_image004[4]

 

clip_image006[4]

 

7.       Now we need to create some registry keys in SQL Server to make it use this filter DLL.

a.       The keys I created below are for a default instance of SQL Server 2008, so change the InstanceID as appropriate.

b.      The CLSID given below is specific to this IFilterSample only.

c.       Notice that the file extension I provided is called “.filtersample”.

d.      The path provided has to match the path from where it was registered in step #6.

 

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSearch\Filters\.filtersample]
@="{6C337B26-3E38-4F98-813B-FBA18BAB64F5}"

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSearch\CLSID\{6C337B26-3E38-4F98-813B-FBA18BAB64F5}]
@="C:\\Users\\sudarn\\Desktop\\FilterSample\\Release\\FilterSample.dll"
"ThreadingModel"="Both"
"Flags"=dword:00000001

 

8.       Now, connect to the SQL instance as a sysadmin and run the following sequence of commands

 

EXEC sp_fulltext_service @action=‘load_os_resources’, @value=1;

EXEC sp_fulltext_service ‘verify_signature’, 0

EXEC sp_fulltext_service ‘update_languages’;

EXEC sp_fulltext_service ‘restart_all_fdhosts’;

 

9.       To verify the FilterSample.DLL has been successfully loaded and recognised by SQL Server, you can check the DMV sys.fulltext_document_types or use sp_help_fulltext_system_components ‘filter’

 

 

Testing the IFilter

That was all, folks! You can now do a test to verify that the search is working as expected. There are 2 ways of testing the IFilter.

 

1.      Using SQL Server

a.       Create a test table and enable it for full-text indexing.

CREATE TABLE [dbo].[testTBL](

      [doctype] [nvarchar](50) NOT NULL,

      [document] [varbinary](max) NULL,

      [docname] [varchar](50) NOT NULL,

 CONSTRAINT [PK_testTBL] PRIMARY KEY CLUSTERED

(

      [docname] ASC

)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

) ON [PRIMARY]

 

GO

CREATE FULLTEXT CATALOG [CatalogTEST] WITH ACCENT_SENSITIVITY = OFF
AUTHORIZATION
[dbo]

 

GO

CREATE FULLTEXT INDEX ON [dbo].[testTBL](

[docname] LANGUAGE [English],

[doctype] LANGUAGE [English],

[document] TYPE COLUMN [doctype] LANGUAGE [English])

KEY INDEX [PK_testTBL]ON ([CatalogTEST], FILEGROUP [PRIMARY])

WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM)

GO

 

b.      Insert a sample document into the table. The IFilterSample comes with a file called “SampleFile.filtersample”, which can be used for test purposes.

 

INSERT INTO dbo.testTBL(doctype,docname,document)

SELECT ‘.filtersample’ AS doctype, ‘SampleFile.filtersample’ AS docname,

* FROM OPENROWSET(BULK N’C:\Users\sudarn\Desktop\FilterSample\SampleFile.filtersample’, SINGLE_BLOB) AS Document

 

c.       Perform a test search

select * from dbo.testTBL where contains(document, ’emergency’)

 

d.      You can also check to see the document was indexed by querying DMV,

SELECT * FROM sys.dm_fts_index_keywords(db_id(‘test’), object_id(‘dbo.testTBL));

 

2.      Using FILTDUMP.exe

 

Filtdump.exe is a sample tool that is available again in WinSDK and can be used to test a particular filter by loading it and then getting the text chunk out of any sample document. It is available in both 32-bit and 64-bit versions. It can be found here once you install WinSDK,

C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\x64\filtdump.exe

C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\filtdump.exe

 

For example,

C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\x64>filtdump.exe "C:\Users\s

udarn\Desktop\FilterSample\SampleFile.filtersample"

 

<snippet of the output>

———————————————————————-

Type = 31 (0x1f): AuthorXYZ

 

———————————————————————-

                                Attribute = B725F130-47EF-101A-A5F102608C9EEBAC\19

                                idChunk = 4

                                BreakType = 0 (No Break)

                                Flags(chunkstate) = 0x1 (Text)

                                Locale = 1033 (0x409)

                                IdChunkSource = 0

                                cwcStartSource = 0

                                cwcLenSource = 0

———————————————————————-

 this is a test of the emergency broadcasting system.

 

This is the content of the document.

 

I’m most pleased that it is the content of the document.

 

</snippet>

 

If you see the above then the IFilter is working fine as you are able to read it and get the text chunks out.

 

 

Possible Issue(s)

You might receive this error message in the SQLFT logs when trying to index a newly inserted row. This indicates that the IFilter was not recognised and there was an issue in loading it.

 

“Warning: No appropriate filter was found during full-text index population for table or indexed view ‘[test].[dbo].[Documents]’ (table or indexed view ID ‘1003150619’, database ID ‘6’), full-text key value ‘SampleFile.filtersample’. Some columns of the row were not indexed.”

 

(OR)

 

Error ‘0x8004170c: The document format is not recognized by the filter.’ occurred during full-text index population for table or indexed view ‘[test].[dbo].[Documents]’ (table or indexed view ID ‘1003150619’, database ID ‘6’), full-text key value ‘SampleFile.filtersample’. Failed to index the row.

 

 

The possible things to check for above error are:-

1.       Ensure that the registration and other steps given above are done in the same sequence as indicated.

2.       If you are using the Debug build ensure that the VCRuntime Debug dependency DLL is present (msvcr90d.dll or msvcr100d.dll).

3.       Use Process Monitor to ensure that your DLL is getting loaded

4.       Use Dependency Walker to ensure there are no other missing dependencies for your DLL.

5.       Verify that the DLL architecture (x64/x86) matches your SQL Server architecture. You cannot mix & match this.

 

I must point out one important thing here if you are going to use this DLL on other machines. You will need to have the Visual Studio Redistributable package installed on the target machine where you plan to deploy this. You will need the corresponding redistributable package for the VS version you used to build the DLL.


FilterSample VS2010 Project

You can download the FilterSample project I used for my testing from here. This was built using Visual Studio 2010 on Windows Server 2008 R2 x64. Just FYI – The Release and Debug DLL present in the attached project is x64. You can rebuild to x86 as per your requirements.

https://skydrive.live.com/?cid=2c5b6bdee139d310&sc=documents&uc=1&id=2C5B6BDEE139D310%21135# 

 

Have fun IFiltering and stay tuned for more on this blog.

Advertisements

4 Responses to “Getting a custom IFilter working with SQL Server 2008/R2 (IFilterSample)”

  1. Hi Sudarshan,

    Another good post from you. I am reading this blog very seriously and found usefull too. 🙂 Since you have expertise in Full Text Search I Have some question about FTS.

    I have a SQL Server 2005 cluster installed in Win 2003 32-bit machine. On passive node (Node 2) I upgraded the SQL 2005 to SQL Server 2008 R2 and then I failover my SQL instance from active node (Node 1) to passive node (Node 2). All the services are coming online on Node 2 except of “Full Text Service” in cluadmin. While the time of upgrade it gave me 3 options for Full Text Service a) Import b) Rebuild c) Reset. I selected the option b) Rebuild.

    Now when i am trying to bring online Full text service then it is failing. In the system eventviewer i am finding below info:-

    1. Cluster resource ‘SQL Server Fulltext (BBA1)’ in Resource Group ‘MMKBBA1’ failed.
    2. Cluster generic service ‘SQL Server Fulltext (BBA1)’ could not be found.

    In the Application eventviewer i am not getting any error. Can you please advice on this.
    Thanks.

    • Sudarshan Narasimhan said

      Thanks Gaurav.

      I think the issue you have reported is because from SQL 2008 onwards, FTS is part of the database engine and there is no longer any service (or cluster resource) called full-text search since it is part of sqlservr.exe. The “rebuild” option during the upgrade would have rebuilt your catalogs and these will now be a part of your database MDF file.

      The Upgrade should remove the fulltext resource since full-text resource is not required for SQL Server 2008 as FTS is part of the database engine. If you are left with the resource after upgrade, then most likely it was set as a dependency on SQL Server or something else, so setup couldn’t remove it. You can find out from Detail.txt, as to why this resource was not deleted.

      Removing dependent relationship on resource ‘SQL Server (BBA1)’ from resource ‘SQL Server Fulltext (BBA1)’

      You can safely remove FTS as a dependency and move/delete your full-text resource.

      • Hi Suadrshan,

        Thanks very much for your reply. As you said in SQL 2008, FTS is a part of database engine itself, I have 2 doubts here that 1. During the upgrade\installation it asked me a service account and password for FTS, why? It shouldn’t ask me if it’s part of DB engine.
        2. It’s not showing as resource in cluadmin then is it mean it is a service but not cluster aware service like integartion service?

      • Sudarshan Narasimhan said

        1. The service account/password asked during setup is for the filter daemon host service (FDHOST) and not for the full-text engine (the FTengine is part of the database engine). FDHOST is the one which loads filters and word-breakers. This link has more info on fdhost process
        http://msdn.microsoft.com/en-us/library/ms142541.aspx

        2. You’re right. FDHOST is a standalone service present on both the nodes and it is not cluster aware, which is why you don’t see a resource for this in cluadmin.

        To sum it up, the full-text indexing, crawling and index storage is now part of sqlservr.exe and indexes are stored in the database MDF files, but the filtering, word-breaking components are loaded in fdhost.exe which talks to sqlservr.exe to get the full-text work done.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: