Elasticsearch: Anatomy of a Search

debugging search Aug 28, 2018

Dan Iverson

Search is becoming more important in PeopleSoft and is often required for applications to work properly. We’ve talked before about how to configure Elasticsearch with PeopleSoft, but what happens when we perform a search? There are a number of moving parts that handle the search query, apply security, and display the results. Let’s take a look at the anatomy of a search to understand the process.

Overview of the Search Flow

The user creates a search request. Generally, this happens when a user opens the Search Box and enters a term or terms to search on. But this process can also start from a component. A good example is the “Find Learning” page in ELM. When you open the component, it sends a query to Elasticsearch for the data that populates the page.

Search Framework Builds the Query

The search request from the user needs to be parsed and formatted so that Elasticsearch understands the query. The app package PT_SEARCH has a number of methods that check for wildcards and partial words, and that create a valid JSON request for Elasticsearch.
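To make this step more concrete, here is a minimal Python sketch of turning search box text into a JSON request. It is not the PT_SEARCH code. The function name build_search_request, the rule of adding a trailing * to words without a wildcard, and the choice of Elasticsearch's query_string query are all assumptions made for illustration.

```python
import json

def build_search_request(user_text, size=10):
    """Turn raw search-box text into a JSON request body (hypothetical helper)."""
    terms = []
    for word in user_text.split():
        # Assumption: a word that already carries a wildcard is passed through,
        # and any other word gets a trailing * so partial words still match.
        if "*" in word or "?" in word:
            terms.append(word)
        else:
            terms.append(word + "*")
    body = {
        "size": size,
        "query": {
            "query_string": {
                "query": " ".join(terms),
                "default_operator": "AND",
            }
        },
    }
    return json.dumps(body)

request_json = build_search_request("laptop dell*")
```

The real framework handles many more cases (facets, filters, search categories), so treat this only as a picture of the shape of the work.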

IB Builds JSON Message

In the app package PTSF_ES, the JSON search query string is converted into an IB Message using the IB_GENERIC message. The app package also builds the HTTP headers for the transaction, such as the content-type: application/json and the Authorization header.

The Integration Broker POSTs the message against the alias for the Search Category. If you deployed the EP_ASSETS search in Finance for the database PSFTDB, the alias is ep_assets_psftdb_orcl_es_alias. When you search, you search against the Search Category instead of the individual indices. The alias is the Elasticsearch mechanism to let you search against the category instead of each index.
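The alias name and the HTTP request described above can be sketched in Python as follows. This is an illustration, not the PTSF_ES code: the helper names, the localhost URL, and the password are hypothetical, and the alias pattern is inferred from the single EP_ASSETS example above. The request object is only built here, not sent.

```python
import base64
import urllib.request

def alias_for(search_category, database):
    # Pattern inferred from the example in the text:
    # EP_ASSETS + PSFTDB -> ep_assets_psftdb_orcl_es_alias
    return f"{search_category}_{database}_orcl_es_alias".lower()

def build_search_post(base_url, search_category, database, query_json, user, password):
    """Build (but do not send) a POST of the JSON query against the category alias."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    url = f"{base_url}/{alias_for(search_category, database)}/_search"
    return urllib.request.Request(
        url,
        data=query_json.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )

# Hypothetical host and credentials, for illustration only
req = build_search_post(
    "http://localhost:9200", "EP_ASSETS", "PSFTDB",
    '{"query": {"match_all": {}}}', "esadmin", "example-password",
)
```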

Elasticsearch Validates Security

A PeopleSoft plugin running in Elasticsearch verifies the security of the user request before running the query. The orcl_es_auth plugin will validate the Basic Authentication value in the HTTP POST from the IB. The user it is validating is esadmin – the user and password defined on the Search Instance page. In your Elasticsearch logs, you can see the validation happening:

[authentication           ] Authentication plugin : authenticate method
[authentication           ] Authentication type : Base64
[authentication           ] Authentication successful
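For readers who have not looked inside a Basic Authentication header, the sketch below shows the kind of check those log lines describe: decode the Base64 value and compare it with the expected user and password. It is a generic Python illustration and not the orcl_es_auth plugin's code. The function name check_basic_auth is hypothetical.

```python
import base64
import hmac

def check_basic_auth(header_value, expected_user, expected_password):
    """Return True when an Authorization header carries the expected credentials."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed Base64 and bytes that are not valid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # compare_digest avoids leaking information through comparison timing
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok
```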

ES Runs the Query

Next, Elasticsearch will run the query. In the Elasticsearch logs, you can see which indexes are used to perform the search:

[orclacl                  ] Indices/ aliases : [ ep_assets_psftdb_orcl_es_alias ]
[orclacl                  ] Index types retrieved : [ ep_am_asset_psftdb ]

In this example, the EP_ASSETS Search Category has only one index associated with it.

ES Checks the ACL Cache

After Elasticsearch has the search results for the query, the results are filtered based on PeopleSoft security. Search Definitions can implement different security types to lock down search data. There is User/Role security, as well as Document security. If a Search Definition has a security type, the security data is collected when the Search Index is built with PTSF_GENFEED. The security attributes are attached to each row of data sent to Elasticsearch.

If the index has security enabled, Elasticsearch needs to know what security the user running the query has access to. The orcl_es_acl plugin is responsible for managing the user security data inside Elasticsearch. If the user has run a query in the last two hours, the orcl_es_acl plugin uses the existing security data for the user stored in Elasticsearch.

In the Elasticsearch logs, you can see this in action:

[orclacl                  ] Document is secured for the type : ep_am_asset_psftdb
[orclacl                  ] Orcl ACL Plugin method: getAttrValFromCache

ES Performs Callback (if needed)

If the user has not run a query for the index before, the plugin performs a callback to PeopleSoft. The callback asks PeopleSoft what security values the user has, and the plugin stores them in a special index: orcl_es_acl. To perform the callback, Elasticsearch creates a new cURL request to the IB URI /RESTListeningConnector/PSFTDB/getsecurityvalues.v1/. The callback URL and security are stored in Elasticsearch under the index orcl_es_acl.

The request includes three parameters as well:

  • ?type=ep_am_asset_psftdb
  • ?user=PS
  • ?attribute=BU_SECURITY_SES

The type parameter is the name of the search index requesting the security data. The user is the name of the user running the search query. And the attribute is the security value configured on the Search Definition page.
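Putting the URI and the three parameters together, a callback URL could be assembled like this in Python. The gateway host is hypothetical, the function name is made up for the example, and the sketch assumes the parameters are combined as a standard query string (one ? followed by & separators).

```python
from urllib.parse import urlencode

def build_callback_url(ib_base_url, database, index_type, user, attribute):
    """Assemble the security callback URL from the pieces described above."""
    path = f"/RESTListeningConnector/{database}/getsecurityvalues.v1/"
    query = urlencode({"type": index_type, "user": user, "attribute": attribute})
    return f"{ib_base_url}{path}?{query}"

# Hypothetical gateway address, for illustration only
url = build_callback_url(
    "http://ib.example.com:8000/PSIGW",
    "PSFTDB", "ep_am_asset_psftdb", "PS", "BU_SECURITY_SES",
)
```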

PS Runs the Callback Code

When the IB receives the callback request, it hands the request to the Application Package specified on the Search Definition. The App Package inspects the parameters to see what security attribute has been requested. The callback code then runs any appropriate code and SQL to build a list of Roles or Permission Lists assigned to the user.

PS Returns the ID and Security

The Permission Lists or Roles are assembled into an array, and each item is prefixed with P: or R:. Once the array is assembled, the values are returned to Elasticsearch in a response message. For example, the response might contain an array that looks like ["P:PTPT1000","P:PTPT1200","P:PTPT3100"] – Permission Lists assigned to that user.
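The prefixing itself is simple. The Python sketch below shows the idea; the real callback is PeopleCode in an Application Package, the function name build_security_values is hypothetical, and the full layout of the response message is not shown here.

```python
import json

def build_security_values(permission_lists=(), roles=()):
    """Prefix Permission Lists with P: and Roles with R:."""
    values = [f"P:{name}" for name in permission_lists]
    values += [f"R:{name}" for name in roles]
    return values

values = build_security_values(["PTPT1000", "PTPT1200", "PTPT3100"])
payload = json.dumps(values, separators=(",", ":"))
```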

ES Filters the Results

Elasticsearch has no concept of the security structures in PeopleSoft, so it uses the data returned from the callback to remove search results that the user is not authorized to see. The callback array from above lists the Permission Lists assigned to a user. Elasticsearch will build a filter query to remove any rows from the search results that do not contain one of those Permission Lists.
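One way to picture such a filter is a bool query that wraps the user's query and adds a terms filter on the security values. The Python sketch below builds that structure. It is an assumption about the general shape, not the plugin's actual query, and the field name security_attributes is hypothetical.

```python
def add_security_filter(query, allowed_values, field="security_attributes"):
    """Wrap a query so only documents carrying an allowed security value match."""
    return {
        "query": {
            "bool": {
                "must": [query],
                # terms matches documents whose field contains at least one value
                "filter": [{"terms": {field: list(allowed_values)}}],
            }
        }
    }

secured = add_security_filter(
    {"query_string": {"query": "laptop*"}},
    ["P:PTPT1000", "P:PTPT1200", "P:PTPT3100"],
)
```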

Something to keep in mind: Elasticsearch can only filter the search results based on the Permission Lists stored in each search result. This means that Permission Lists are attached to each search result when the PTSF_GENFEED process is run. If you change your security dramatically (say, re-implement Business Unit Security), you may need to run a full build to push your security changes into Elasticsearch.

You can view the permission lists attached to each document with the Elasticsearch API. In your browser, you can call the /_search REST method against any index or alias. If you deployed the PTPORTALREGISTRY search definition with the default settings, you can call the URL http://elastic.psadmin.io:9200/ptportalregistry_psftdb_orcl_es_alias/_search?pretty=true to view the data in Elasticsearch. In the document, you will find the list of security attributes like this: ["P:PTPT1000","S:Admin"].
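If you would rather script this check than use a browser, a minimal Python sketch might look like the following. The helper names are hypothetical, the fetch needs network access to your own Elasticsearch host, and it does not send credentials, which your instance may require.

```python
import json
import urllib.request

def search_url(base_url, alias):
    return f"{base_url}/{alias}/_search?pretty=true"

def extract_sources(response_body):
    """Pull the stored document (_source) out of each hit in a _search response."""
    return [hit["_source"] for hit in response_body["hits"]["hits"]]

def fetch_documents(base_url, alias, timeout=10):
    """Call _search and return the documents (requires a reachable server)."""
    with urllib.request.urlopen(search_url(base_url, alias), timeout=timeout) as response:
        return extract_sources(json.load(response))

url = search_url("http://elastic.psadmin.io:9200", "ptportalregistry_psftdb_orcl_es_alias")
```

Each returned document can then be inspected for its list of security attributes.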

ES Returns the Results

After the filter query completes, Elasticsearch will return the search results as a JSON message back to PeopleSoft. When the message is received, the results are parsed and displayed by the search results component. Depending on the application, you might see the generic search results page or an application-specific page.

 


Note: This was originally posted by Dan Iverson and has been transferred from a previous platform.