Elasticsearch: Anatomy of a Search
Aug 28, 2018 | Dan Iverson
Search is becoming more important in PeopleSoft and is often required for applications to work properly. We've talked before about how to configure Elasticsearch with PeopleSoft, but what happens when we perform a search? A number of moving parts handle the search query, apply security, and display the results. Let's take a look into the anatomy of a search to understand the process.
Overview of the Search Flow
The user creates a search request. Generally, this happens when a user opens the Search Box and enters one or more search terms. But this process can also start from a component. A good example is the "Find Learning" page in ELM. When you open the component, it sends a query to Elasticsearch for the data that populates the page.
Search Framework Builds the Query
The search request from the user needs to be parsed and formatted so that Elasticsearch understands the query. The app package PT_SEARCH has a number of methods that check for wildcards and partial words, and create a valid JSON request for Elasticsearch.
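The parsing step can be pictured with a minimal Python sketch. This is not the PT_SEARCH code, which is not shown here. The function name, the prefix-match rule for partial words, and the choice of the Elasticsearch query_string query are all assumptions made for illustration.

```python
import json


def build_search_body(user_input: str, size: int = 10) -> str:
    """Turn free-text input into a JSON search body (illustrative only)."""
    terms = user_input.split()
    # Assumption: a term without a wildcard is treated as a partial word
    # and gets a trailing "*" so it matches as a prefix.
    expanded = [t if ("*" in t or "?" in t) else t + "*" for t in terms]
    body = {
        "size": size,
        # query_string is a real Elasticsearch query type; whether the
        # Search Framework emits it is an assumption.
        "query": {"query_string": {"query": " ".join(expanded)}},
    }
    return json.dumps(body)
```

Calling build_search_body("asset pump*") would leave the term that already carries a wildcard alone and expand the other one.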
IB Builds JSON Message
In the app package PTSF_ES, the JSON search query string is converted into an IB Message using the IB_GENERIC message. The app package also builds the HTTP headers for the transaction, such as the content-type: application/json and the Authorization header.
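As a rough illustration of those two headers, here is a small Python sketch. It is not the PTSF_ES implementation. The function name and the password value in the usage note are hypothetical, and it assumes standard HTTP Basic authentication, which the plugin log later in this post suggests.

```python
import base64


def build_headers(user: str, password: str) -> dict:
    """Build the two headers named above for a JSON POST (sketch only)."""
    # HTTP Basic auth is "user:password", Base64-encoded.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
```

For example, build_headers("esadmin", "secret") returns a dictionary whose Authorization value starts with "Basic ".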
The Integration Broker POSTs the message against the alias for the Search Category. If you deployed the EP_ASSETS search in Finance for the database PSFTDB, the alias is ep_assets_psftdb_orcl_es_alias. When you search, you search against the Search Category instead of the individual indices. The alias is the Elasticsearch mechanism that lets you search against the category instead of each index.
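The alias name in the example appears to follow a simple pattern: lowercase category, lowercase database name, then a fixed suffix. The Python sketch below encodes that pattern. It is inferred from the examples in this post rather than taken from PeopleSoft documentation, and both function names are hypothetical.

```python
def search_alias(category: str, database: str) -> str:
    """Derive the alias name from a Search Category and database name.

    Assumption: the pattern <category>_<database>_orcl_es_alias, in
    lowercase, as seen in the examples in this post.
    """
    return f"{category.lower()}_{database.lower()}_orcl_es_alias"


def search_url(base_url: str, category: str, database: str) -> str:
    """Build the _search endpoint URL for the category's alias."""
    return f"{base_url.rstrip('/')}/{search_alias(category, database)}/_search"
```

With this sketch, search_alias("EP_ASSETS", "PSFTDB") gives the alias used in the example above.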
Elasticsearch Validates Security
A PeopleSoft plugin running in Elasticsearch verifies the security of the user request before running the query. The orcl_es_auth plugin validates the Basic Authentication value in the HTTP POST from the IB. The user it validates is esadmin, the user and password defined on the Search Instance page. In your Elasticsearch logs, you can see the validation happening:
[authentication ] Authentication plugin : authenticate method
[authentication ] Authentication type : Base64
[authentication ] Authentication successful
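To make the check concrete, here is a minimal Python sketch of validating a Basic Authentication header. It only illustrates the general technique. The internals of orcl_es_auth are not shown in this post, and the function name and credentials are hypothetical.

```python
import base64
import hmac


def check_basic_auth(header_value: str, expected_user: str, expected_password: str) -> bool:
    """Return True if an Authorization header carries the expected credentials."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:  # covers malformed Base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    pass_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and pass_ok
```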
ES Runs the Query
Next, Elasticsearch will run the query. In the Elasticsearch logs, you can see which indexes are used to perform the search:
[orclacl ] Indices/ aliases : [ ep_assets_psftdb_orcl_es_alias ]
[orclacl ] Index types retrieved : [ ep_am_asset_psftdb ]
In this example, the EP_ASSETS Search Category has only one index associated with it.
ES Checks the ACL Cache
After Elasticsearch has the search results for the query, the results are filtered based on PeopleSoft security. Search Definitions can implement different security types to lock down search data. There is User/Role security, as well as Document security. If a Search Definition has a security type, the security data is collected when the Search Index is built with PTSF_GENFEED. The security attributes are attached to each row of data sent to Elasticsearch.
If the index has security enabled, Elasticsearch needs to know which security values the user running the query holds. The orcl_es_acl plugin is responsible for managing the user security data inside Elasticsearch. If the user has run a query in the last two hours, the orcl_es_acl plugin uses the existing security data for the user stored in Elasticsearch.
In the Elasticsearch logs, you can see this in action:
[orclacl ] Document is secured for the type : ep_am_asset_psftdb
[orclacl ] Orcl ACL Plugin method: getAttrValFromCache
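The cache behavior described above can be sketched as a small time-to-live cache in Python. This is not the orcl_es_acl plugin code. The class and parameter names are hypothetical, and the sketch assumes the two-hour window is measured from when the values were stored.

```python
import time

CACHE_TTL_SECONDS = 2 * 60 * 60  # the two-hour window mentioned above


class AclCache:
    """Cache a user's security values per index type (illustrative only)."""

    def __init__(self, fetch, ttl=CACHE_TTL_SECONDS, clock=time.time):
        self._fetch = fetch    # callable(user, index_type) -> list of values
        self._ttl = ttl
        self._clock = clock    # injectable so the cache can be tested
        self._entries = {}     # (user, index_type) -> (stored_at, values)

    def get(self, user, index_type):
        key = (user, index_type)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]    # cache hit: no callback needed
        # Cache miss or expired entry: ask the source system again.
        values = self._fetch(user, index_type)
        self._entries[key] = (now, values)
        return values
```

The fetch callable stands in for the callback to PeopleSoft, so a hit returns immediately and a miss triggers one round trip.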
ES Performs Callback (if needed)
If the user has not run a query for the index before, the plugin performs a callback to PeopleSoft. The callback asks PeopleSoft what security values the user has and stores them in a special index: orcl_es_acl. To perform the callback, Elasticsearch creates a new cURL request to the IB URI /RESTListeningConnector/PSFTDB/getsecurityvalues.v1/. The callback URL and security are stored in Elasticsearch under the index orcl_es_acl.
The request includes three parameters as well:
- ?type=ep_am_asset_psftdb
- ?user=PS
- ?attribute=BU_SECURITY_SES
The type parameter is the name of the search index requesting the security data. The user is the name of the user running the search query. And the attribute is the security value configured on the Search Definition page.
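The callback request described above can be sketched in Python as a URL builder. The function name and the gateway host in the usage note are hypothetical. The sketch also assumes the three parameters travel together in one standard query string.

```python
from urllib.parse import urlencode


def callback_url(ib_base: str, database: str, index_type: str,
                 user: str, attribute: str) -> str:
    """Assemble the security callback URL (sketch under stated assumptions)."""
    path = f"/RESTListeningConnector/{database}/getsecurityvalues.v1/"
    # Assumption: the parameters are sent as a normal query string,
    # with "?" before the first and "&" between the rest.
    query = urlencode({"type": index_type, "user": user, "attribute": attribute})
    return f"{ib_base.rstrip('/')}{path}?{query}"
```

For example, callback_url("http://ib.example.com:8000", "PSFTDB", "ep_am_asset_psftdb", "PS", "BU_SECURITY_SES") produces a single URL carrying all three values.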
PS Runs the Callback Code
When the IB receives the callback request, it hands the request to the Application Package specified on the Search Definition. The App Package inspects the parameters to see what security attribute has been requested. The callback code then runs any appropriate code and SQL to build a list of Roles or Permission Lists assigned to the user.
PS Returns the ID and Security
The Permission Lists or Roles are assembled into an array, and P: or R: is prepended to each item. Once the array is assembled, the values are returned to Elasticsearch in a response message. For example, the response might contain an array that looks like ["P:PTPT1000","P:PTPT1200","P:PTPT3100"], which lists the Permission Lists assigned to that user.
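The prefixing step is simple enough to show in a few lines of Python. The real callback code lives in a PeopleCode Application Package and is not reproduced here, so the function name and signature are hypothetical.

```python
def tag_security_values(permission_lists=(), roles=()):
    """Prefix each value so the receiver can tell the two kinds apart."""
    # "P:" marks a Permission List and "R:" marks a Role.
    return [f"P:{p}" for p in permission_lists] + [f"R:{r}" for r in roles]
```

Passing the three Permission Lists from the example above yields the same array shown in the text.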
ES Filters the Results
Elasticsearch has no concept of the security structures in PeopleSoft, so it uses the data returned from the callback to remove search results that the user is not authorized to see. The callback array from above lists the Permission Lists assigned to a user. Elasticsearch builds a filter query to remove any rows from the search results that do not contain one of those Permission Lists.
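The effect of that filter can be sketched in plain Python. Elasticsearch does this inside the query itself, so this is only a model of the outcome. The field name "acl" and the document layout are hypothetical.

```python
def filter_hits(hits, user_values, acl_field="acl"):
    """Keep only documents that share at least one security value with the user."""
    allowed = set(user_values)
    # A document survives only if its attached values overlap the user's.
    return [h for h in hits if allowed.intersection(h.get(acl_field, []))]
```

A document tagged only with a Permission List the user does not hold is dropped, while one tagged with any matching Permission List is kept.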
Something to keep in mind: Elasticsearch can only filter the search results based on the Permission Lists stored in each search result. This means that Permission Lists are attached to each search result when the PTSF_GENFEED process is run. If you change your security dramatically (say, re-implement Business Unit Security), you may need to run a full build to push your security changes into Elasticsearch.
You can view the permission lists attached to each document with the Elasticsearch API. In your browser, you can call the /_search REST endpoint against any index or alias. If you deployed the PTPORTALREGISTRY search definition with the default settings, you can call the URL http://elastic.psadmin.io:9200/ptportalregistry_psftdb_orcl_es_alias/_search?pretty=true to view the data in Elasticsearch. In the document, you will find the list of security attributes like this: ["P:PTPT1000","S:Admin"].
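The same lookup can be scripted instead of run in a browser. The Python sketch below uses only the standard library, and the function names are hypothetical. If the authentication plugin is enabled, the request would presumably also need an Authorization header, which this sketch leaves out.

```python
import json
import urllib.request


def inspect_request(base_url: str, alias: str, pretty: bool = True) -> urllib.request.Request:
    """Build a GET request against the alias's _search endpoint."""
    url = f"{base_url.rstrip('/')}/{alias}/_search"
    if pretty:
        url += "?pretty=true"
    return urllib.request.Request(url, method="GET")


def fetch_hits(request: urllib.request.Request) -> list:
    """Send the request and return the list of matching documents."""
    with urllib.request.urlopen(request) as resp:
        # Elasticsearch nests the documents under hits.hits.
        return json.load(resp)["hits"]["hits"]
```

Building the request is kept separate from sending it so the URL can be checked before anything goes over the network.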
ES Returns the Results
After the filter query completes, Elasticsearch returns the search results to PeopleSoft as a JSON message. When the message is received, the results are parsed and displayed by the search results component. Depending on the application, you might see the generic search results page or an application-specific page.
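As a rough picture of the parsing step, this Python sketch walks a standard Elasticsearch search response, which nests matching documents under hits.hits. It is not the PeopleSoft results component. The function name and the output shape are hypothetical.

```python
import json


def summarize_results(response_text: str) -> list:
    """Flatten a search response into one small record per hit."""
    payload = json.loads(response_text)
    return [
        {
            "index": hit.get("_index"),
            "id": hit.get("_id"),
            "score": hit.get("_score"),
            "source": hit.get("_source", {}),  # the indexed document fields
        }
        for hit in payload.get("hits", {}).get("hits", [])
    ]
```

A response with no matches simply produces an empty list, which a results page could render as "no results found".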
Note: This was originally posted by Dan Iverson and has been transferred from a previous platform. There may be missing comments, style issues, and possibly broken links. If you have questions or comments, please contact [email protected].