[Wactclc-alma] Regex in Alma Analytics

Thomas, Kirsti Kirsti.Thomas at seattlecolleges.edu
Fri Jan 3 13:14:54 PST 2020


I found a handy page on Advanced Analytics Formulas at Harvard (I'm so glad those folks have migrated to Alma-Primo! Their online documentation is amazing!) with some helpful examples: 


Extracting a subfield (column formula)

Basic formula

EVALUATE('regexp_substr(%1, %2, 1, 1)', [Insert Alma Column SQL], '\$[subfield code][^\$]+')

The regexp_substr function has between two and six parameters: source string (in Analytics, the Alma column) and regular expression pattern, which are both required, and the optional start position, occurrence, match, and regular expression sub-expression parameters. Here %1 is for the Alma column and %2 is for the regular expression, which are both at the end of the formula. The first 1 is for the starting position. The second 1 is for the first occurrence.

The regular expression searches for the subfield delimiter and code ($f, for example) followed by one or more characters that are not a dollar sign ($), meaning from the start of the subfield until the end of that subfield.

Example: Extracting first subfield f from the Holding Local Param 03 (920 reporting field)

EVALUATE('regexp_substr(%1, %2, 1, 1)', "Holding Details"."Holding Local Param 03", '\$f[^\$]+')


https://wiki.harvard.edu/confluence/display/LibraryStaffDoc/Advanced+Formulas


When I attempt to build a regular expression based on this, however, I keep getting syntax errors in Alma Analytics. I've tried all of the following. Obviously I'm doing something wrong:

EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", 'Licensed[^\.]')
EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", 'Licensed[^\.]+')
EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", '\.[^\.]+')

Wade, Sally and Dan-- Any ideas what I'm doing wrong?

Kirsti

-----Original Message-----
From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu> On Behalf Of Thomas, Kirsti
Sent: Friday, January 3, 2020 11:51
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu>
Subject: [Wactclc-alma] Regex in Alma Analytics

I'm trying to figure out how to remove all text, with the exception of a variable string, from the field display in an Analytics report.  I'm hoping you folks have some ideas.

Specifically, in an the Internal Description field for Electronic Collections, I have a sentence that says which library an electronic product is licensed for. Examples include:

Licensed for all 3 colleges.
Licensed for North and Central only.
Licensed for South only.


The sentence I want always begins with "Licensed for" then there's a variable-length text string that always ends with a period, followed by a variable amount of more text.

I want to get rid of any text that appears in front of "Licensed for" and any text that appears after the first period that shows up to the right of "Licensed for"

When I'm done, the field should only contain the sentence "Licensed for ... ."

Does that make sense?

I'm guessing I need to use LOCATE and SUBSTRING commands to do this, but I can't quite figure out how to make this work when the length of the phrase I want to end up with is variable.


Kirsti S. Thomas
Library Technical Service Manager
Seattle Colleges
kirsti.thomas at seattlecolleges.edu





CAUTION: This email originated outside of the Seattle Colleges’ email system. Do not click links or open attachments unless you recognize the sender and know the content is safe. Questions? Contact IT Services at x6333 (Central), x3630 (North), x5844 (South) or email ITHelp at seattlecolleges.edu.



More information about the Wactclc-alma mailing list