
Coveo Computed Field for Retrieving Page Content and Renderings in Sitecore

I needed to index all text from a page and its associated renderings, including single-line, multi-line, and rich text fields, into a single Coveo computed field.

Let’s get started.

Text Extraction:

  • Created a TextExtraction class that inherits BaseComputedField.
  • GetComputedFieldValue: Computes the concatenated text from the item and its renderings.
  • GetRenderingSource: Retrieves the source item for a given rendering reference.
  • GetDatasourceItem: Resolves and retrieves the data source item using the pipeline manager.
  • GetAllReferencedText: Extracts text from fields and adds it to a result list.
  • GetReferenceFieldData: Handles reference fields and extracts text from referenced items.
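The flow of those methods can be sketched in a language-neutral way. The following is a minimal Python illustration, not the actual C# class: the dictionary shapes for items, fields, and renderings are hypothetical stand-ins for Sitecore items, and the HTML stripping is a simple assumption about how rich text might be cleaned.

```python
import re
from html import unescape

# Field types treated as text, following the post (single-line, multi-line, rich text).
TEXT_FIELD_TYPES = {"Single-Line Text", "Multi-Line Text", "Rich Text"}


def strip_html(value):
    """Remove tags from rich text and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", value)
    return re.sub(r"\s+", " ", unescape(no_tags)).strip()


def collect_text(item, results):
    """Append the text of every text-type field on a (hypothetical) item dict."""
    for field in item.get("fields", []):
        if field["type"] in TEXT_FIELD_TYPES and field["value"].strip():
            results.append(strip_html(field["value"]))


def compute_field_value(page, datasources):
    """Concatenate text from the page and from each rendering's datasource item."""
    results = []
    collect_text(page, results)
    for rendering in page.get("renderings", []):
        source = datasources.get(rendering.get("datasource"))
        if source is not None:  # renderings without a resolvable datasource are skipped
            collect_text(source, results)
    return " ".join(results)
```

Non-text fields (images, links) are ignored, and a rendering whose datasource cannot be resolved contributes nothing rather than failing the whole computation.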

 

 

Configuration:

Let’s add the TextExtraction computed field into the config file.

 

I published all the files, and it’s time to check.

I selected a page with many renderings and hit Rebuild Tree. (I set the indexing strategy to SyncMaster. If you have intervalAsyncMaster or onPublishEndSyncSingleInstance, publish the item to see the record in the index.)

If you run into any issues, set a breakpoint in Visual Studio; Rebuild Tree will hit it so you can debug.

 

Let’s check the Coveo index – Yay! The page content and its associated rendering content were extracted successfully.

Hope this helps.

Happy Sitecoring!


Algolia Search Provider for Sitecore 10.2.1

The Algolia Search Provider for Sitecore is a powerful integration combining robust content management capabilities with Algolia’s lightning-fast, full-text search engine.

This provider enables Sitecore-driven websites and applications to deliver instant, relevant, and highly customizable search experiences to end-users, ensuring faster content discovery and improved user engagement.

While exploring, I found that Dmitry Harnitski and Peter Procházka have implemented the Algolia Search Provider for Sitecore 9.1 and previous versions. Thank you so much!

https://github.com/dharnitski/Sitecore.Algolia

https://github.com/chorpo/Sitecore.Algolia/tree/sitecore91

Upgrading to Sitecore 10.2.1

The following are the steps I took to upgrade to Sitecore 10.2.1:

  •  I forked from Peter Procházka’s Sitecore 9.1 branch

https://github.com/chorpo/Sitecore.Algolia/tree/sitecore91

  • Upgraded the Sitecore NuGet packages to 10.2.1.

  • Upgraded the Algolia.Search NuGet package to 6.12.1.
  • Finally, I was able to build the solution successfully.
  • After the build succeeded, I moved the module into the existing solution.
  • I didn’t get a chance to upgrade the unit test project. If you have upgraded it, please let me know.

The module is checked into the following repository.

https://github.com/Madhumidha/Sitecore.Algolia/tree/Sitecore1021

The repo has setup instructions and configuration details.

Hope this helps.

Happy Sitecoring!

 


Sitecore Forms : Custom Submit Action to Create and Publish Content Item (Part 2)

In Part 1, we saw how to:

  • Create Custom Submit Action in Sitecore
  • Create a code-behind class that inherits SubmitActionBase
  • Create a secure API that can create and publish an item

https://madhuanbalagan.com/sitecore-forms-custom-submit-action-to-create-and-publish-item-from-form-data-part-1

Now, let’s focus on how to:

  • Create a separate Authorization in the Identity Server for Forms 
  • Create a Controller to bridge between Custom Submit Action and API call
  • Set the Custom Submit Action fields

1. Create a separate Authorization in the Identity Server for Forms 

Add the FormsServerClient node and its value to the Identity Server’s Sitecore.IdentityServer.DevEx.xml file. After making the change, run an IIS reset for it to take effect.

Note: Don’t forget to add transforms for ClientSecret!

 

Ideally, call the Identity Server from Postman to verify that it generates a token using the Forms ClientID and ClientSecret values from the Sitecore.IdentityServer.DevEx.xml file.

Now that the config has its own ClientID and ClientSecret, let’s set up the Authorize method with a bearer token specifically for the Forms identity credentials.

The SitecoreRestServices class handles authenticated HTTP requests to Sitecore’s REST APIs, managing access tokens and retrying requests if authentication fails. It uses dependency injection to configure the HTTP client and obtain the necessary settings.
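The token handling described above follows a common pattern: cache the token, and on an authentication failure fetch a fresh one and retry once. Here is a minimal Python sketch of that pattern, not the actual C# service. The TokenClient class, and the fetch_token and send callables, are hypothetical names injected for illustration; treating HTTP 401 as the retry trigger is an assumption.

```python
class TokenClient:
    """Caches a bearer token and retries once when the API rejects it."""

    def __init__(self, fetch_token, send):
        self._fetch_token = fetch_token  # callable returning a bearer token string
        self._send = send  # callable(token, payload) -> (status_code, body)
        self._token = None

    def post(self, payload):
        if self._token is None:
            self._token = self._fetch_token()
        status, body = self._send(self._token, payload)
        if status == 401:
            # The token has probably expired: fetch a new one and retry exactly once.
            self._token = self._fetch_token()
            status, body = self._send(self._token, payload)
        return status, body
```

Retrying only once keeps a misconfigured ClientSecret from turning into an endless loop of token requests.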

2. Create a Controller to bridge between Custom Submit Action and API call

Let’s bridge the Custom Submit Action and API call – The CustomController class extends SitecoreController and uses dependency injection to obtain an instance of ICreateAutoPublishService.

It defines a CreateAndPublish method – HTTP POST endpoint that processes CreateAndAutoPublishModel, calls the service to create and publish content, and returns a JSON response indicating success or failure.

The method includes error handling to return appropriate HTTP status codes and messages for different exceptions.
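The mapping from exceptions to HTTP status codes can be illustrated with a short Python sketch. This is not the actual controller: the function name, the model shape, the service callable, and the choice of which exception maps to which status code are all assumptions made for illustration.

```python
def create_and_publish(model, service):
    """Return (status_code, body) for a create-and-publish request."""
    try:
        if not model or not model.get("name"):
            raise ValueError("The item name is required.")
        item_id = service(model)  # hypothetical service call that returns the new item ID
        return 200, {"success": True, "itemId": item_id}
    except ValueError as error:  # invalid input
        return 400, {"success": False, "message": str(error)}
    except PermissionError as error:  # authorization failure
        return 401, {"success": False, "message": str(error)}
    except Exception:  # anything else; avoid leaking internal details to the caller
        return 500, {"success": False, "message": "Unexpected error while creating the item."}
```

Returning a generic message for unexpected errors keeps stack traces in the logs rather than in the form response.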

3. Set the Custom Submit Action fields

The final step – Let’s set the Model Type and Error Message based on the class we created. 

Publish the form and its related items. It should be good to go!

Hope this helps.

Happy Sitecoring!


Coveo Computed Field for Extracting PDFs with Apache Tika in Sitecore

I needed to index the PDF file content in the Media Library, but when I tried to index it, the PDFSharp library couldn’t extract it.

Sitecore recommends using the following libraries: IFilter, Apache Tika, or SolrCell for indexing the media content.

I wrote a detailed post on installing and integrating Apache Tika into the project.

https://madhuanbalagan.com/sitecore-apache-tika-integration-for-secure-media-file-indexing

Now that Tika is integrated, let’s get started on creating a computed field to extract PDF content using the Apache Tika service.

Media Extraction:

  • I created a MediaExtraction class, which inherits BaseComputedField.
  • The GetComputedField method calls the Apache Tika service and extracts the text asynchronously.
  • It returns the extracted document text.

 

 

Apache Tika Service:

  • The Tika Service class implements the IContentExtractionService interface.
  • The main method, ReadJsonObject, sends the document to the Tika server and extracts the text content from the parsed JSON response.
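As a rough illustration of that request and response handling, here is a minimal Python sketch rather than the actual C# service. It assumes the Tika server's /rmeta/text endpoint, which returns a JSON array of metadata objects with the extracted text under the X-TIKA:content key; the post does not say which endpoint the service calls, and the function names are hypothetical.

```python
import json
import urllib.request


def build_tika_request(base_url, document_bytes):
    """Build a PUT request for the Tika server's /rmeta/text endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/rmeta/text",
        data=document_bytes,
        method="PUT",
        headers={"Accept": "application/json"},
    )


def extract_text(response_body):
    """Join the X-TIKA:content values from an /rmeta JSON response."""
    entries = json.loads(response_body)
    parts = [(entry.get("X-TIKA:content") or "").strip() for entry in entries]
    return "\n".join(part for part in parts if part)


def read_document(base_url, document_bytes):
    """Send the document to Tika and return the extracted text."""
    request = build_tika_request(base_url, document_bytes)
    with urllib.request.urlopen(request) as response:
        return extract_text(response.read().decode("utf-8"))
```

With Tika running locally, read_document("http://localhost:9998", pdf_bytes) would return the text of the main document followed by the text of any embedded documents.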

 

 

Tika ConnectionString:

Please make sure that Tika is up and running.

<add name="tika" connectionString="http://localhost:9998" />

When I checked, it wasn’t running for some reason.

 

Run the following PowerShell commands to restart Tika.

cd c:\tika

java -jar tika-server-1.22.jar -s

Let’s check – Tika is now up and running.

 

Configuration:

Let’s add the MediaExtraction computed field into the config file.

I published all the files and it’s time to check.

I selected the PDF document in the Media Library and hit Rebuild Tree (I set the indexing strategy as SyncMaster. If you have intervalAsyncMaster or onPublishEndSyncSingleInstance, publish the item to see the record in Index.)

 

 

 

Let’s check the Coveo index – Yay! The PDF content was extracted successfully.


 

The same computed field would work for Word and PowerPoint documents as well.

Hope this helps.

Happy Sitecoring!


Sitecore Forms : Custom Submit Action to Create and Publish Content Item (Part 1)

 

Problem

I was working on an internal-facing website that needed a form that would not save its data in the Experience Forms database. Instead, the data needed to be saved as an item in the master database so it could be published and consumed on a listing page.

At first, I considered creating a Custom Submit Action to create and publish an item. However, I later realized that the CD server lacked access to the CM server, preventing it from directly generating an item in the master database. So, I came up with the idea of creating a custom API service and combining it with a custom submit action.

Solution

To implement this, I followed this process:

  • Create Custom Submit Action in Sitecore
  • Create a code-behind class that inherits SubmitActionBase
  • Create a secure API that can create and publish an item
  • Create an API authorization user in the identity server 
  • Create a Controller to proxy between Custom Submit Action and API calls
  • Set the Custom Submit Action fields

 

1. Custom Submit Action in Sitecore

Create the custom submit action in the following path /sitecore/system/Settings/Forms/Submit Actions.


 

2. Create a code-behind class that inherits SubmitActionBase

Create a class that inherits the SubmitActionBase class and overrides the Execute method which calls the API asynchronously to create and publish an item.

The CreateAndAutoPublish class extends SubmitActionBase<string> and is designed to handle form submissions in Sitecore, creating and auto-publishing content items.

It uses dependency injection to obtain instances of IHttpClientFactory, BaseSettings, and ISitecoreRestServices, which are lazily initialized. The ExecuteAction method validates the form submission context and parameters, then calls the Execute method to prepare a model from form fields, serialize it to JSON, and send it to a Sitecore API endpoint using an HTTP POST request.

The class includes helper methods to extract values from form fields and handles errors by logging them and adding them to the form submission context’s error collection.

 

3. Create an API call that creates and publishes the item  

The API call does the following:

  1. Creates an item under a specified folder
  2. Moves and syncs the item into a bucket
  3. Publishes the item
  4. Clears the Sitecore cache
  5. Ensures the published item is indexed, or force-indexes the item
  6. Clears the Vercel cache for the frontend
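The sequence above can be sketched as an ordered list of named steps that stops at the first failure. This is a minimal Python illustration, not the actual API. The run_steps helper and the step callables are hypothetical, and stopping on the first error (so that a failed publish never triggers a cache clear) is an assumption about the desired behavior.

```python
def run_steps(steps, context):
    """Run (name, action) pairs in order; stop and report at the first failure."""
    completed = []
    for name, action in steps:
        try:
            action(context)
        except Exception as error:
            return {"success": False, "completed": completed, "failed": name, "error": str(error)}
        completed.append(name)
    return {"success": True, "completed": completed, "failed": None, "error": None}
```

The result names the step that failed, which makes it easier to log a useful message or return it to the submit action.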

In Part 2, we will see how to

  • Create an API authorization user in the identity server 
  • Create a Controller to proxy between Custom Submit Action and API calls
  • Set the Custom Submit Action fields

Hope this helps.

Happy Sitecoring!


2024 – Contributions

2024 has been a great year, with presentations at Sitecore Symposium and various Sitecore User Groups worldwide.

Symposium Presentation: 

 

Blog Posts : 

 

Presentations : 

 

Contributions: 

  • Dec 22, 2023 – Feedback/Review for Sitecore Content Hub Administrator Certification Exam
  • Dec 15, 2023 – Feedback for Sitecore MVP Program
  • Mar 01, 2024 – Sitecore CDP & Personalize Certification Exam Beta review
  • April 19, 2024 – Reviewed Sitecore Dev 10 Beta Exam

 

 Co-organized SUG-Pittsburgh Meetups: 

 

YouTube – Videos published for SUG – Pittsburgh  

  • Jan 02, 2024 – Abstracting personalization – CP
  • Mar 20, 2024 – Sitecore XM Cloud 101 – A Beginner’s Guide
  • Apr 03, 2024 – From Brainstorm to Brilliance: The Untold Story of Sitecore Hackathon
  • Apr 12, 2024 – Sitecore XM Cloud Forms
  • May 6, 2024 – Achieving Behavioral Personalization with XM Cloud Plus
  • May 22, 2024 – Sitecore Search and XM Cloud – Experiences and Learnings
  • June 11, 2024 – Tips and Tricks for Next.js and Sitecore Headless
  • Aug 5, 2024 – Content Hub Web Client SDK – Internal & External Integrations
  • Sep 3, 2024 – Sitecore Troubleshooting 101
  • Oct 8, 2024 – 8 Content Migration Pitfalls to Avoid and Lessons Learned

 

Co-organized SUG-Philadelphia Meetups: 

 YouTube – Videos published for SUG – Philadelphia 

 

Conferences I attended: 

  • Symposium 
  • MVP Summit 
  • MVP Lunches   
  • MVP Webinars 
  • All SUG Pittsburgh and Queen City meetups  
  • Many SUG Boston/ Columbus/ Atlantic meetups   

 

Plans for 2025:

Learn and contribute: 

  • Composable DXP 
  • Sitecore SaaS Offerings
  • Containerization 

Co-organize : 

  • Monthly SUG-Pittsburgh meetup 
  • Quarterly SUG-Philadelphia meetup 

Present: 

  • SUGCON EU/Symposium 
  • SUG Meetups 

Happy New Year! Happy Sitecoring!


Sitecore: Apache Tika Integration for Secure Media File Indexing


Problem

We had some secure PDFs in the Media Library that were not getting indexed in Solr because they couldn’t be extracted using the PDFSharp library.

The logs showed the following error while extracting the secure files:

16804 12:04:53 ERROR DefaultMediaItemTextExtractor: Cannot extract content from media item with id ‘{442006A5-8CB6-4ABE-8855-786D2A870201}’.
Exception: PdfSharp.Pdf.IO.PdfReaderException
Message: The PDF document is protected with an encryption not supported by PDFsharp.
Source: PdfSharp
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.ValidatePassword(String inputPassword)
at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider passwordProvider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider provider)
at Sitecore.ContentSearch.ContentExtraction.Readers.PdfSharpReader.ReadAll(String filePath)
at Sitecore.ContentSearch.ContentExtraction.Common.DefaultMediaItemTextExtractor.ExtractTextFromMedia(MediaItem mediaItem)

38536 12:04:53 ERROR DefaultMediaItemTextExtractor: Cannot extract content from media item with id ‘{442006A5-8CB6-4ABE-8855-786D2A870201}’.
Exception: PdfSharp.Pdf.IO.PdfReaderException
Message: The PDF document is protected with an encryption not supported by PDFsharp.
Source: PdfSharp
at PdfSharp.Pdf.Security.PdfStandardSecurityHandler.ValidatePassword(String inputPassword)
at PdfSharp.Pdf.IO.PdfReader.Open(Stream stream, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider passwordProvider)
at PdfSharp.Pdf.IO.PdfReader.Open(String path, String password, PdfDocumentOpenMode openmode, PdfPasswordProvider provider)
at Sitecore.ContentSearch.ContentExtraction.Readers.PdfSharpReader.ReadAll(String filePath)
at Sitecore.ContentSearch.ContentExtraction.Common.DefaultMediaItemTextExtractor.ExtractTextFromMedia(MediaItem mediaItem)

 

Solution

  • If you would like to index media content, Sitecore recommends using one of the following libraries: IFilter, Apache Tika, or SolrCell.

https://www.searchstax.com/docs.hc/can-we-use-apache-tika

  • Azure web apps have a limitation with the IFilter library, so I ended up using Apache Tika.

Steps to Integrate:

  • Download the Apache Tika server file, tika-server-1.22.jar.
    • Sitecore recommends Apache Tika version 1.22; refer to the compatibility table for your Sitecore version.
  • Save the server file in a folder on the Solr server, e.g. c:\tika.
  • In PowerShell, navigate to that folder and run the following command to start the Tika server.

java -jar tika-server-1.22.jar


Note: The default hostname is localhost and the port is 9998.

If you would like a specific hostname and port number, include them as parameters in the command:

 java -jar tika-server-1.22.jar --host=<Tikahostname> --port=<portnumber>

Once the server is running, open http://localhost:9998 to check that it is working as expected. You should see the welcome message!


  • Add the following patch file to the App_Config/Include/zzz folder to replace DefaultMediaItemTextExtractor from Sitecore.ContentSearch.ContentExtraction.

 

 

  • Last step – Let’s add Tika URL into ConnectionStrings.config file.

<add name="tika" connectionString="http://localhost:9998" />

  • Let’s test quickly: use Rebuild Tree in the Developer ribbon for a single item, or rebuild the entire index.
  • Once indexing is complete, check whether the media item is available in the index.

Quick Tip: To search for a particular item in Solr, use the following query in the parameter q on your index page

_uniqueid:*[item id in lowercase without braces]*
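If you look up items often, the ID normalization can be scripted. This is a small Python sketch of the rule in the tip; the helper name is hypothetical, and the example ID is the one from the log above.

```python
def solr_uniqueid_query(item_id):
    """Build the wildcard _uniqueid query from a Sitecore item ID."""
    normalized = item_id.strip().strip("{}").lower()  # drop braces, lowercase
    return f"_uniqueid:*{normalized}*"


print(solr_uniqueid_query("{442006A5-8CB6-4ABE-8855-786D2A870201}"))
# prints: _uniqueid:*442006a5-8cb6-4abe-8855-786d2a870201*
```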


Hope this helps.

Happy Sitecoring!


Coveo for Sitecore: Index Error Troubleshooting and Resolution

I installed Coveo 5.0.1039.1 on a local Sitecore 10.1 instance.

After Coveo activation, the indexes weren’t loading and threw an error.


The logs showed ‘The parameter ‘p_ApiKey’ must not be an empty string’ error.

Exception: System.ArgumentException
Message: Precondition failed: The parameter 'p_ApiKey' must not be an empty string
Parameter name: p_ApiKey
Source: Coveo.Framework
at Coveo.Framework.CNL.Precondition.RaiseArgumentException(String p_Message, String p_ParameterName)
at Coveo.Framework.CNL.Precondition.NotEmpty(String p_Parameter, String p_ParameterName)
at Coveo.CloudPlatformClientBase.Communication.CloudPlatformHttpClientFactory.CreateAuthorizedJsonHttpClient(String p_ApiKey)
at Coveo.CloudPlatformClientBase.CloudPlatformClient..ctor(CloudPlatformConfiguration p_Configuration, ICloudPlatformHttpClientFactory p_CloudPlatformHttpClientFactory, IPipelineRunnerHandler p_PipelineRunnerHandler, ISerializer p_Serializer, ICoveoSettings p_CoveoSettings, IStaticTTLCacheFactory`2 p_StaticTTLCacheFactory, ICriticalExceptionHandler p_CriticalExceptionHandler)
at Coveo.CloudPlatformClientBase.CloudPlatformClient..ctor(CloudPlatformConfiguration p_Configuration)
at Coveo.CloudPlatformClientBase.Communication.CloudPlatformClientFactory.GetCloudPlatformClient(CloudPlatformConfiguration p_Configuration)
at Coveo.SearchProvider.Licensing.CloudLicenseRetriever.GetCloudLicense()
at Coveo.SearchProvider.Licensing.CloudLicenseRetriever.GetLicense(Boolean p_ForceRetrieve)
at Coveo.SearchProvider.Licensing.Cloud.LicenseRetriever.GetLicense(Boolean p_ForceRetrieve)
at Coveo.SearchProvider.Licensing.LicenseManager.RetrieveLicense(Boolean p_ForceUpdate)
at Coveo.SearchProvider.Licensing.LicenseManager.EnsureValidLicense()
at Coveo.SearchProvider.Licensing.LicenseManager.GetLicenseInformation()
at Coveo.SearchProvider.Rest.SitecoreRestHttpHandler.InitializeLicenseSettings()
at Coveo.SearchProvider.Rest.SitecoreRestHttpHandler.OnInitializeSettings()
at Coveo.Search.Api.Proxy.ProxyHttpHandler.OnInitialize()
at Coveo.Search.Api.Proxy.ProxyHttpHandler.EnsureInitialized()
at Coveo.Search.Api.Proxy.ProxyHttpHandler.ProcessRequest(IHttpContext p_Context)
at Coveo.SearchProvider.Rest.SitecoreRestHttpHandlerDispatcher.ProcessRequest(HttpContext p_Context)
at System.Web.HttpApplication.CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
at System.Web.HttpApplication.ExecuteStepImpl(IExecutionStep step)
at System.Web.HttpApplication.ExecuteStep(IExecutionStep step, Boolean& completedSynchronously)

After some research, I learned that two API keys in Coveo.CloudPlatformClient.Custom.config need to match the keys in the Coveo Platform.

1. apiKey
2. searchApiKey

When I logged into the platform, the existing keys were not visible because they are stored securely (and I’m not sure where I saved them!), so I decided to create new keys.

1. SearchApiKey

To create the Search API Key, we must ensure the correct permissions are in place.

Ensure that Impersonate -> Allowed is selected to limit the scope of the API Key, which can be selected from the drop-down list.

2. ApiKey

To create the ApiKey, we need to set multiple privileges.

Content Tab:

    1. Fields -> Edit
    2. Security Identities -> Edit
    3. Security Identity Providers -> Edit
    4. Sources -> Edit all


Organization Tab:

    1. Organization -> Edit


Search Tab:

    1. Search Page -> View all


When the keys are created, make sure to save them in a secure place!

It is time to update the config with the new keys.

Modify the apiKey and searchApiKey values in Coveo.CloudPlatformClient.Custom.config under the App_Config/Include/Coveo folder.

Let’s reload the Coveo Index Manager. No more errors!

Indexes are loaded and rebuilt successfully. Yay!


Hope this helps.

Happy Sitecoring!

References:
https://docs.coveo.com/en/2484/coveo-for-sitecore-v5/activate-silently#creating-the-api-keys


Coveo for Sitecore: Troubleshooting and Diagnostics

Coveo’s Diagnostics page is super helpful when troubleshooting Coveo issues. It is listed in the Coveo Search menu in the Sitecore Control Panel, or you can reach it directly at the following URL:

https://[CMS Site]/sitecore modules/web/coveo/admin/coveodiagnosticpage.aspx

 

Coveo for Sitecore components state

This section shows the status of all Coveo-related services. When everything is healthy, each component reports a healthy state; when a component fails, the section shows a detailed error message.

Healthy Component State

 

Errors in Component State


 

Coveo for Sitecore version information

It comes in handy for checking Coveo and Sitecore versions and their compatibility.

Current Coveo for Sitecore version: 5.0.1153.1

Current Sitecore version: 10.2.0.6766

Compatibility status: these versions are compatible

 

Coveo for Sitecore organization information

This section shows the organization and its usage details.

 

Coveo for Sitecore configuration files

This section shows all Coveo-related config files that are currently loaded in the system.

 

Coveo for Sitecore published items

It shows whether the Coveo-related Sitecore items are published. If they are not, it’s time to publish them 🙂

 

Coveo for Sitecore Indexing test

This section comes in handy for indexing a single item or a path. It is really helpful when an item has been published but is not available in the Coveo index.

 

 

Coveo for Sitecore log viewer

This section is my favorite. I typically use it on Production environments when we don’t have access to the servers, so we can quickly view the logs and troubleshoot issues without logging in to them.

 

Indexes List

It shows all the indexes and the IsCoveo flag differentiates the Coveo and Sitecore Indexes.

 

 

Download Diagnostics Package

Another super helpful tool is the Download Diagnostics Package button at the top of the page.

It packages all the config and log files needed to open a Coveo Support ticket.

I hope this helps someone.

Happy Sitecoring!


Install and Configure Coveo for Sitecore

In this post, I’ll install Coveo 5.0 on Sitecore 10.2. Let’s explore.

1. Download

Choose the Coveo package based on the Sitecore version and download the relevant package.

https://docs.coveo.com/en/2274/coveo-for-sitecore-v5/releases-and-downloads 


2. Install

Upload and install the package using the Sitecore Installation Wizard.

Accept the customer agreement.

 

 

3. Activation

Once the installation is complete, Sitecore will show the following popup.

Let’s explore how to activate Coveo for Sitecore.

The login link takes you to https://platform.cloud.coveo.com/login

I logged in using a Google account for demo purposes.

 


4. Configure

After authorization succeeds, create your organization by providing a name and choosing one of the following organization types:

  • Enterprise Trial
  • Pro Trial

https://www.coveo.com/en/pricing/sitecore-integration


Use the default Indexing options.

Provide the farm name and Sitecore credentials, then click Activate.

Once Activation is completed, Rebuild all Coveo Indexes to finish the setup.

I hope this helps.

Happy Sitecoring!
