From search to dataset
Patents analysis in GetFocus
Searching for patents

Imagine you want to find patents about a specific chemistry or technology, for example ammonia-free compositions for hair dye. To start building a patent landscape:

1. Go to the Search page.
2. Select Technology as the search type. (Want to use a different search type? See the GetFocus article on search types.)
3. Enter your query in plain language, for example “ammonia-free hair dye composition”.
4. Run the search.

You’ll see a progress window while GetFocus runs the search, then you’ll land on the Results page.

(Screenshot: Search page)

💡 Build your search query using the formula [technology] + [application] + [context].
Example: ammonia-free [technology] hair dye [application] for gray coverage [context].

GetFocus uses vector-based (semantic) search, which means you don’t need to write long Boolean strings to get relevant results. (New to vector search? See the GetFocus article on vector search.)

One key difference from Boolean search:

- With Boolean search, broader terms usually return more results.
- With natural-language (semantic) search, adding specific context helps the system understand your intent, so more detail often leads to more relevant results.

Adding relevant context is crucial for making AI search perform well. The more clearly you describe what you’re looking for, the better GetFocus can retrieve the right documents. A full sentence works better than a list of keywords.

Good example: “Ammonia-free hair dye using monoethanolamine + 4-amino-2-hydroxytoluene for 100% gray coverage.”
Why? Well specified, with additional relevant context.

What not to do: “hair dye” or “ammonia free”.
Why?
Too broad and vague on its own. It doesn’t say what product, which approach, or what performance goal you care about, so you’ll pull in a huge range of unrelated patents.

Now that you’re on the Results page, here are a few basics to know:

- Search details: click the info icon (1) to view your search type and query.
- Patent family count: the side panel shows how many patent families are in your dataset (2).
- Sorted by relevance: patents are ordered from most to least relevant, so your strongest matches appear at the top. You can also see each patent’s relevance score for your search (3).
- Broad first results (recall over precision): initial results are intentionally broad, so you may see some noise at first.
- Filter for accuracy: to sharpen the dataset, apply filters. This is exactly what we’ll cover in the next step.

Want to understand how each patent record is shown in GetFocus? See the article “Patent card”.

Filtering to find the right patents quickly

After you run a search, you’ll get a patent dataset, but it will often include some noise. By “noise,” we mean patents that are somewhat related (they use similar terms or address a neighboring problem) but aren’t truly in scope for what you’re trying to analyze. That’s where filters come in.

Your toolkit includes many different filters, and you can read more about each one in the article “How to filter search results?”.

For this quickstart guide, we’ll keep things simple and focus on the LLM filter. It lets you describe what you want in natural language, so you can quickly remove noise and keep only the patents that match your topic.

- Click Create a new LLM filter.
- In the instruction box, describe what you want to keep in plain language.

Example instruction: “Only include patents about 100% ammonia-free hair dye compositions for gray coverage that also have conditioning or nourishment benefits.”

Need help writing a good instruction?
See the LLM filter section of “Prompting like a pro”: https://docs.getfocus.eu/prompting-like-a-pro

- Click Create LLM filter (or press Enter) to start. GetFocus will review each patent family and decide whether it matches your instruction. Anything that doesn’t match will be filtered out.

💡 Runs in the background: LLM filters process about 12 patent families per second and can keep running while you continue working in GetFocus.

- When processing is finished, come back and click Apply filter to see the results.

‼️ One LLM filter can process up to 10,000 patent families. If your dataset is larger, GetFocus will process the 10,000 highest-relevance families first.

After you apply the LLM filter, the dataset shrinks because patents that don’t match your criteria are filtered out.

Extracting the insights

Now, on to the insights. When you’re ready to turn your dataset into answers, GetFocus offers two ways to chat with your patents:

- Chat with set: ask questions across a dataset (up to 1,000 families).
- Chat with invention: deep dive into one patent in detail.

Use Chat with set for “What’s happening across the landscape?” and Chat with invention for “What does this patent specifically say?”

Chat with set

Use Chat with set to ask high-level questions across a group of patents: trends, themes, and portfolio comparisons.

What it reads: titles, abstracts, and claims (for all families in the set).

Best for:
- Spotting trends and how they change over time
- Comparing portfolios across companies
- Summarizing what’s in your filtered dataset

How it works: type your question, and your answer will be generated below. If your dataset contains more than 1,000 families, Chat with set will use a subset of 1,000 families to generate an answer.

💡 If you want full control over which families are included, filter your dataset down to ≤ 1,000 families before asking your question.

You can also apply common filters directly in Chat with set (e.g., publication year,
status, patent office, ultimate owner).

Chat with invention

Use Chat with invention when you want detailed Q&A on a specific patent.

What it reads: title, abstract, claims, and the full description (including images, if available).

Best for:
- Understanding a single invention thoroughly
- Extracting implementation details, embodiments, or claim scope
- Answering precise technical questions

Chat with invention includes references and cites sources at the bottom of its answer.

Chat with invention supports two modes:
- Non-reasoning mode: best for quick, straightforward questions (fast responses).
- Reasoning mode: best for complex questions (takes more time, higher-quality reasoning).

What's next?

You've built your dataset and have the tools to analyze it. To go advanced, explore the articles and interactive walkthroughs below.

⚖️ Patentability/prior art search: identify the novelty of an idea and find the prior art most closely related to the concept.
https://getfocus.storylane.io/share/jiahhkfjenia

⚖️ Invalidity search: identify patents that destroy the novelty of a patent you want to invalidate.
https://getfocus.storylane.io/share/f7dm0kkv6utc

