Searching with Odin

How to filter search results?

Vector search in Filters

Just as you can use vector search to create initial search results, you can also use it to filter them. The Vector Search filter lets you filter search results using words, phrases, sentences, or even entire paragraphs. Unlike Keyword search, it does not look for particular words or sentences; it uses the meaning of your input to narrow the dataset further. For example, if your initial search query was "car seat", you might enter "artificial leather" into the vector search filter to narrow the results to patent families that describe artificial leather car seats, without the usual limitations of keyword search. Remember to press Enter after typing to save your input as a tag. Once saved, the input turns into a blue box, as in the screenshot below. The maximum input is 384 words. Anything beyond the first 384 words is not vectorized and therefore does not affect the filtering outcome.

Document image


Positive

Use the positive field to indicate what you do want in your search results. When using Positive search, multiple entries will be interpreted as having an AND relationship. In other words, multiple entries into the Positive search field will include patents that are similar to Input A AND Input B.

Negative

Use the negative field to indicate what you don't want in your search results. A Negative search will interpret multiple entries as having an OR relationship. In other words, multiple entries into the Negative search field will exclude patents that are similar to Input A OR Input B.
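The AND/OR behaviour described above, together with the 384-word limit, can be sketched as follows. This is a minimal illustration and not Odin's actual implementation: the function names, the toy three-dimensional vectors, the use of cosine similarity, and the 0.8 threshold are all assumptions made for the example.

```python
import math

MAX_WORDS = 384  # word limit stated above; words beyond it are ignored


def truncate_input(text, max_words=MAX_WORDS):
    """Keep only the first max_words words of a filter input."""
    return " ".join(text.split()[:max_words])


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def passes_vector_filter(doc_vec, positives, negatives, threshold=0.8):
    """Positive entries combine with AND, negative entries with OR.

    The threshold is a hypothetical cut-off for "similar enough".
    """
    similar_to_all_positives = all(cosine(doc_vec, p) >= threshold for p in positives)
    similar_to_any_negative = any(cosine(doc_vec, n) >= threshold for n in negatives)
    return similar_to_all_positives and not similar_to_any_negative


# Toy vectors standing in for real embeddings (assumed values).
doc = [1.0, 0.1, 0.0]
print(passes_vector_filter(doc, positives=[[1.0, 0.0, 0.0]], negatives=[[0.0, 0.0, 1.0]]))
```

In this sketch a document is kept only if it is similar to every positive entry and to none of the negative entries, which mirrors the relationships described for the two fields.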

Keyword search 

The keyword filtering section has an include and an exclude option. Type a keyword in include or exclude and hit Enter to save it. If you do not hit Enter, Odin will not consider the keyword. Once saved, the keyword turns into a blue box, as in the screenshot below.

Document image


Include

Every keyword you include MUST be present in the patent family for it to still show up in your search results. If you add multiple keywords in include, the default relationship between the keywords is AND. In other words, if you add "bike" and "tire" as keywords, only patent families that contain both words will be displayed in your results.

Exclude

In exclude, the default relationship between keywords is OR. Any patent family that contains an excluded keyword will be ignored. In other words, if you exclude "saddle" and "gears", all patent families that contain either word will be excluded. The logic behind using different default relationships for include and exclude keywords is that you typically want to zoom in on a particular concept in the dataset with include, and want to exclude many more aspects with exclude.

Patent family segments to match against

Below include/exclude you can select which segments of the patent families to match your keyword filters against. Keyword search can be run on the following patent sections: Invention title, Abstract, Claims, and Description. Typically, patent subjects will be mentioned in the Title, Abstract, or Claims.

💡To improve search results, uncheck "Description", as this section often contains many words that are not directly associated with the topic of the invention itself.
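The include and exclude rules above, combined with segment selection, can be sketched as below. This is only an illustration under stated assumptions: the dictionary layout, the segment keys, and the exact whole-word matching are hypothetical and may differ from how Odin matches keywords internally.

```python
import re


def words_in(family, segments):
    """Collect the lowercase words found in the selected segments."""
    text = " ".join(family.get(segment, "") for segment in segments)
    return set(re.findall(r"\w+", text.lower()))


def passes_keyword_filter(family, include, exclude, segments):
    """Include keywords combine with AND, exclude keywords with OR."""
    words = words_in(family, segments)
    has_all_included = all(k.lower() in words for k in include)
    has_any_excluded = any(k.lower() in words for k in exclude)
    return has_all_included and not has_any_excluded


# Hypothetical patent family used only for this example.
family = {
    "title": "Bike tire with puncture protection",
    "abstract": "A tire for a bike comprising a protective layer.",
    "description": "The bike may also have a saddle and gears.",
}

# With "description" unchecked, the excluded words are never seen.
print(passes_keyword_filter(family, ["bike", "tire"], ["saddle", "gears"],
                            segments=("title", "abstract")))
# With "description" checked, "saddle" triggers the exclusion.
print(passes_keyword_filter(family, ["bike", "tire"], ["saddle", "gears"],
                            segments=("title", "abstract", "description")))
```

The second call shows why the choice of segments matters: the same family is kept or dropped depending on whether the Description is searched.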

Document image


Custom BOOLEAN filtering

You can create custom Boolean search strings in the keyword filters by typing bool: before entering your keyword. For example, you could type bool: bike OR tyre in include. This overrides the default AND relationship between keywords in include and gives you more freedom to create custom filters.

Other relevant BOOLEAN operators:

Use * as a wildcard that replaces zero or more characters. For example, combus* will also include results that mention combust or combustion.

Use ? to replace a single character. For example, te?t can match test, tent, text, etc.

❗Be aware that wildcard queries can use an enormous amount of memory and perform very badly. Just think how many terms need to be queried to match the query string "a* b* c*".

Use () when you want to make sure that two words are mentioned directly after each other. For example, use (bike tire) if you only want to include results that mention the word tire right after bike.

Use "" for a phrase query. For example, "high catalytic activity of carbonic anhydrase" searches for patents in which all of these words appear in exactly this order.

Use ~ after a phrase for a proximity query, which allows the specified words to be further apart or in a different order. The number after ~ sets the maximum distance between the words. For example, "machine learning"~5 will search for machine and learning within 5 words of each other.
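To make the wildcard and proximity operators more concrete, here is a small sketch of what they match. It uses Python's standard fnmatch module as a stand-in for the search engine's wildcard handling, and it reads "within N words of each other" as a simple difference in word positions. Both choices are assumptions for illustration; the engine behind Odin may count distance differently.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern, word):
    """* matches zero or more characters, ? matches exactly one character."""
    # fnmatch also understands [...] character classes; this sketch does not use them.
    return fnmatchcase(word.lower(), pattern.lower())


def within_distance(text, first, second, max_gap):
    """True if both words occur at most max_gap word positions apart, in either order."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= max_gap for i in first_positions for j in second_positions)


print(wildcard_match("combus*", "combustion"))
print([w for w in ("test", "tent", "text", "tet") if wildcard_match("te?t", w)])
print(within_distance("learning algorithms for a machine", "machine", "learning", 5))
```

The last call illustrates the "different order" part of a proximity query: learning appears before machine, yet the pair still counts as a match because the words are close enough.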

Document image


Publication Numbers

The Publication Numbers filter helps you find specific patents in the search results using their publication number. The standard patent publication format in Odin is: "two-letter country code-patent number-kind code". For example: "US-20180226168-A1".
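If you prepare publication numbers outside Odin, for example in a spreadsheet export, a quick format check like the one below can catch entries that do not follow the hyphenated format. The regular expression is an assumption derived only from the example above (two letters, an alphanumeric number, and a kind code made of a letter with an optional digit); it is not an official Odin validation rule.

```python
import re

# Assumed pattern: country code, number, kind code, separated by hyphens.
PUBLICATION_NUMBER = re.compile(r"^([A-Z]{2})-([0-9A-Z]+)-([A-Z][0-9]?)$")


def parse_publication_number(value):
    """Return (country, number, kind) or None if the format does not match."""
    match = PUBLICATION_NUMBER.match(value.strip().upper())
    return match.groups() if match else None


print(parse_publication_number("US-20180226168-A1"))
print(parse_publication_number("US20180226168A1"))  # hyphens missing
```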

Document image


Organization

To filter the search results by organizations, use the Organization filter. There are two ways to use this filter: by selecting/deselecting an organization from the list, or by entering a specific name in the organization search field and then selecting or deselecting that organization. Searching for an organization's name will allow you to select that particular organization's portfolio within your search set. Simply tick/untick the organizations you want to include or exclude and hit 'Filter' to update your dataset.

Document image


Patent Offices

If you are only interested in patent families with members in specific patent offices, use the patent office filter. Similar to organizations, there are two ways to use it: selecting/deselecting a patent office from the list, or searching for a specific patent office and then selecting/deselecting it. Hit 'Filter' when you have applied your selection to update your dataset.

Document image


Status

If you only want to see patent families that have members with a certain status, you can filter by status. Open the status filter and tick/untick which statuses you want to include in your dataset. There are 5 statuses: 'Active', 'In-Force', 'Pending', 'Inactive', and 'Unknown'.

Document image


In-Force families have at least one member that is currently granted and still in force.

Pending families are those where all patent family members are currently still pending.

Active families are those which are either in force or pending.

Inactive families are those where all patent family members have expired for any reason.

Unknown families are those where we do not have good information from the patent office about the current state of the patent family members, and thus the family as a whole.

Hit 'Filter' to update your dataset when you have applied your selection.
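The status definitions above can be read as rules that derive a family-level status from the statuses of its members. The sketch below encodes that reading. The member labels ("in_force", "pending", "expired", "unknown") and both function names are hypothetical, and combinations the definitions do not cover, such as a mix of pending and expired members, fall through to 'Unknown' purely as a placeholder.

```python
def family_statuses(member_statuses):
    """Derive the family-level statuses from a list of member statuses."""
    members = list(member_statuses)
    if any(s == "in_force" for s in members):
        # At least one granted member that is still in force.
        return {"In-Force", "Active"}
    if members and all(s == "pending" for s in members):
        # Every member is still pending.
        return {"Pending", "Active"}
    if members and all(s == "expired" for s in members):
        # Every member has expired.
        return {"Inactive"}
    # No reliable information, or a combination not covered above (assumption).
    return {"Unknown"}


def matches_status_filter(member_statuses, selected):
    """True if the family has at least one of the ticked statuses."""
    return bool(family_statuses(member_statuses) & set(selected))


print(family_statuses(["in_force", "expired"]))
print(matches_status_filter(["pending", "pending"], ["Active"]))
```

Note that 'Active' is not a separate state in this sketch. It is simply added whenever a family is In-Force or Pending, which follows the definition given above.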

Publication year

With the publication year filter, you can select a pre-determined or custom publication year range for your dataset. Standard filters are: last year, last 3 years, last 5 years, last 10 years, and last 20 years. You can also add a custom range by entering the year range (e.g. 2010-2017) you want to apply to your dataset. Hit 'Filter' to update your dataset when you have entered your selection.

Document image


Expiration year

With the expiration year filter, you can select a pre-determined or custom expiration year date range for your dataset. Standard filters are: Next year, next 3 years, next 5 years, next 10 years, and next 20 years. You can also add a custom date range by entering the year range (e.g. 2025-2030) you want to apply to your dataset. Hit 'Filter' to update your dataset when you have entered your selection.

Document image


Similarity 

The Similarity (%) filter allows you to set a similarity threshold for your dataset, i.e. the minimum similarity score a patent family needs to still be included in your search results. It can be adjusted by entering a specific number in the input field or by simply dragging the bar. The height of the bars on the chart indicates the number of patents at each similarity level.

Document image


For example, a similarity threshold of 83.7 will only include patent families in your search results if they are at least 83.7% similar to your query. Similarity thresholds are an important part of vector search. Since vector search works by grouping and including documents that are similar to your query, at some point you will start getting less desirable results. For example, if you search for 'dog food', your top results are likely to all contain dog food patent family members. However, at some point further down the similarity list, there may be 'cat food' patent families, as cat food is at least somewhat related to dog food.

To determine the appropriate similarity threshold for your search results, consider scrolling through your list of results to identify the similarity score at which your results start to become irrelevant. If you identify such a score, it is a good value for your similarity threshold.
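The effect of the threshold can be pictured as a simple cut-off on a ranked result list. The sketch below uses made-up titles and similarity scores, and the function name is hypothetical; it only illustrates the "at least X% similar" rule described above.

```python
def apply_similarity_threshold(results, threshold):
    """Keep results whose similarity (in percent) is at least the threshold."""
    return [r for r in results if r["similarity"] >= threshold]


# Made-up results for a 'dog food' query, sorted by similarity.
results = [
    {"title": "Dry dog food", "similarity": 91.2},
    {"title": "Dog food with added vitamins", "similarity": 86.0},
    {"title": "Cat food composition", "similarity": 79.5},
]

kept = apply_similarity_threshold(results, 83.7)
print([r["title"] for r in kept])
```

With these example scores, a threshold of 83.7 keeps the two dog food families and drops the cat food family, which is the kind of boundary you would look for when scrolling through your own results.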