[Wactclc-alma] Regex in Alma Analytics

Thomas, Kirsti Kirsti.Thomas at seattlecolleges.edu
Tue Jan 7 11:20:20 PST 2020


Naoko,

I'm trying to figure out a way that allows for variable-length text.  I'm not the only person inputting this text, so there's a high probably that the wording won't be consistent from one field to another. I can't rely on filtering by exact phrases. :(

But since I didn't know how to create unions in Analytics, your response is incredibly helpful and I'll be saving it. :)

Kirsti

From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu> On Behalf Of Yasuda, Naoko
Sent: Monday, January 6, 2020 23:31
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu>
Subject: Re: [Wactclc-alma] Regex in Alma Analytics

Hi Kirsti,

I think you can use SUBSTRING and UNION to do this. The SQL syntax would be the following:

    SELECT SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 28)
    WHERE "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for all 3 colleges.%'
    UNION
    SELECT SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 36)
    WHERE "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for North and Central only.%'
    UNION
    SELECT SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 24)
    WHERE "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for South only.%'

In Analytics, you have to do this in steps and combine three reports:

1. In the "Electronic Collection"."Electronic Collection Internal Description" column:

    a. Add the following formula:
       SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 28)

    b. Add the following filter:
       "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for all 3 colleges.%'

2. Click the combine sets icon [cid:image001.png at 01D5C54C.721D5910] (see https://knowledge.exlibrisgroup.com/Alma/Product_Documentation/010Alma_Online_Help_(English)/080Analytics/050Common__Analytics_Procedures#Combining_Reports) and choose the subject area. I think the default is UNION, but check to see if UNION is selected.
 [cid:image002.png at 01D5C54C.721D5910]

3. Add all the columns that are in the original report.

4. In the "Electronic Collection"."Electronic Collection Internal Description" column:

    a. Add the following formula:
       SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 36)

    b. Add the following filter:
       "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for North and Central only.%'

5. Click the combine sets icon and choose the subject area.

6. Add all the columns that are in the original report.

7. In the "Electronic Collection"."Electronic Collection Internal Description" column:

    a. Add the following formula:
       SUBSTRING("Electronic Collection"."Electronic Collection Internal Description", 1, 24)

    b. Add the following filter:
       "Electronic Collection"."Electronic Collection Internal Description" LIKE 'Licensed for South only.%'

8. Click the Results tab.

I hope this makes sense and possibly works!

Naoko
Highline College

________________________________
From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu<mailto:wactclc-alma-bounces at lists.ctc.edu>> on behalf of Thomas, Kirsti <Kirsti.Thomas at seattlecolleges.edu<mailto:Kirsti.Thomas at seattlecolleges.edu>>
Sent: Friday, January 3, 2020 1:14 PM
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu<mailto:wactclc-alma at lists.ctc.edu>>
Subject: Re: [Wactclc-alma] Regex in Alma Analytics

CAUTION: This email originated from outside Highline College. Do not click links or open attachments unless you validate the sender and know the content is safe. If you are unsure, contact the Help Desk at x4357 (206-592-4357) or email helpdesk at highline.edu<mailto:helpdesk at highline.edu>.

I found a handy page on Advanced Analytics Formulas at Harvard (I'm so glad those folks have migrated to Alma-Primo! Their online documentation is amazing!) with some helpful examples:


Extracting a subfield (column formula)

Basic formula

EVALUATE('regexp_substr(%1, %2, 1, 1)', [Insert Alma Column SQL], '\$[subfield code][^\$]+')

The regexp_substr function has between two and six parameters: source string (in Analytics, the Alma column) and regular expression pattern, which are both required, and the optional start position, occurrence, match, and regular expression sub-expression parameters. Here %1 is for the Alma column and %2 is for the regular expression, which are both at the end of the formula. The first 1 is for the starting position. The second 1 is for the first occurrence.

The regular expression searches for the subfield delimiter and code ($f, for example) followed by one or more characters that are not a dollar sign ($), meaning from the start of the subfield until the end of that subfield.

Example: Extracting first subfield f from the Holding Local Param 03 (920 reporting field)

EVALUATE('regexp_substr(%1, %2, 1, 1)', "Holding Details"."Holding Local Param 03", '\$f[^\$]+')


https://wiki.harvard.edu/confluence/display/LibraryStaffDoc/Advanced+Formulas<https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwiki.harvard.edu%2Fconfluence%2Fdisplay%2FLibraryStaffDoc%2FAdvanced%2BFormulas&data=02%7C01%7C%7Ceeda70279c5a41c3abb808d7934393ab%7C02d8ff38d7114e31a9156cb5cff788df%7C0%7C0%7C637139790770306828&sdata=H5bgJ1OW1Dag9%2BZFmFmdEHb4eRbRUf3QOjWkXdC85J0%3D&reserved=0>


When I attempt to build a regular expression based on this, however, I keep getting syntax errors in Alma Analytics. I've tried all of the following. Obviously I'm doing something wrong:

EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", 'Licensed[^\.]')
EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", 'Licensed[^\.]+')
EVALUATE('regexp_substr(%1, %2, 1, 1)', "Electronic Collection"."Electronic Collection Internal Description", '\.[^\.]+')

Wade, Sally and Dan-- Any ideas what I'm doing wrong?

Kirsti

-----Original Message-----
From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu<mailto:wactclc-alma-bounces at lists.ctc.edu>> On Behalf Of Thomas, Kirsti
Sent: Friday, January 3, 2020 11:51
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu<mailto:wactclc-alma at lists.ctc.edu>>
Subject: [Wactclc-alma] Regex in Alma Analytics

I'm trying to figure out how to remove all text, with the exception of a variable string, from the field display in an Analytics report.  I'm hoping you folks have some ideas.

Specifically, in an the Internal Description field for Electronic Collections, I have a sentence that says which library an electronic product is licensed for. Examples include:

Licensed for all 3 colleges.
Licensed for North and Central only.
Licensed for South only.


The sentence I want always begins with "Licensed for" then there's a variable-length text string that always ends with a period, followed by a variable amount of more text.

I want to get rid of any text that appears in front of "Licensed for" and any text that appears after the first period that shows up to the right of "Licensed for"

When I'm done, the field should only contain the sentence "Licensed for ... ."

Does that make sense?

I'm guessing I need to use LOCATE and SUBSTRING commands to do this, but I can't quite figure out how to make this work when the length of the phrase I want to end up with is variable.


Kirsti S. Thomas
Library Technical Service Manager
Seattle Colleges
kirsti.thomas at seattlecolleges.edu<mailto:kirsti.thomas at seattlecolleges.edu>





CAUTION: This email originated outside of the Seattle Colleges' email system. Do not click links or open attachments unless you recognize the sender and know the content is safe. Questions? Contact IT Services at x6333 (Central), x3630 (North), x5844 (South) or email ITHelp at seattlecolleges.edu<mailto:ITHelp at seattlecolleges.edu>.
CAUTION: This email originated outside of the Seattle Colleges' email system. Do not click links or open attachments unless you recognize the sender and know the content is safe. Questions? Contact IT Services at x6333 (Central), x3630 (North), x5844 (South) or email ITHelp at seattlecolleges.edu<mailto:ITHelp at seattlecolleges.edu>.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.ctc.edu/pipermail/wactclc-alma_lists.ctc.edu/attachments/20200107/aacde50b/attachment-0002.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 1626 bytes
Desc: image001.png
URL: <http://lists.ctc.edu/pipermail/wactclc-alma_lists.ctc.edu/attachments/20200107/aacde50b/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 11030 bytes
Desc: image002.png
URL: <http://lists.ctc.edu/pipermail/wactclc-alma_lists.ctc.edu/attachments/20200107/aacde50b/attachment-0005.png>


More information about the Wactclc-alma mailing list