[Wactclc-alma] Norm Rule to remove rule <U+fffd>

Guidry, Wade WadeG at bigbend.edu
Thu Mar 24 17:00:43 PDT 2022


Thanks, Lesley.

Sorry, I could have just done an all titles search for <U+fffd>.

I’ll take a look and see if I can’t get a solution to you as a going away present 😊

Also, just some background: <U+fffd> is the generic Unicode replacement character, substituted when an incoming character can’t be decoded.

This issue probably stems from a mismatch between the encoding of the incoming record and the encoding specified in the import profile or configuration.
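
To illustrate the kind of mismatch described above, here is a minimal Python sketch (not Alma code). The sample string and the Latin-1 vs. UTF-8 pairing are assumptions chosen for illustration, not details taken from the affected records, and the function names are hypothetical.

```python
# Illustration only: sample text and encodings are assumptions, not taken
# from the affected records. Function names are hypothetical helpers.
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def simulate_mismatch(text: str, actual: str = "latin-1",
                      declared: str = "utf-8") -> str:
    """Encode text with one encoding, then decode it as a different one.

    Bytes that are invalid in the declared encoding are replaced with
    U+FFFD because errors="replace" is used.
    """
    raw = text.encode(actual)
    return raw.decode(declared, errors="replace")


def strip_replacement(text: str) -> str:
    """Remove every U+FFFD from text (the original letter is not restored)."""
    return text.replace(REPLACEMENT, "")


damaged = simulate_mismatch("Résumé")
print(repr(damaged))
print(repr(strip_replacement(damaged)))
```

Note that stripping the character only hides the damage: the original letter is already lost by the time the replacement character appears, so fixing the encoding at import is the more complete remedy.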

I see 172 records with this character, of various ages. But some were created yesterday or today, like this one: 992512785402818

I don’t see any import jobs. Was this record pushed into Alma via Connexion, maybe?

Wade Guidry
Library Consortium Services Manager, WACTCLC
wadeg at bigbend.edu
(509) 760-4474
http://www.wactclc.org

From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu> On Behalf Of Lesley Caldwell
Sent: Thursday, March 24, 2022 4:07 PM
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu>
Subject: Re: [Wactclc-alma] Norm Rule to remove rule <U+fffd>

MMS ID: 992513085002818

From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu> On Behalf Of Guidry, Wade
Sent: Thursday, March 24, 2022 3:28 PM
To: WACTCLC Alma Discussion <wactclc-alma at lists.ctc.edu>
Subject: Re: [Wactclc-alma] Norm Rule to remove rule <U+fffd>

Lesley, can you share the MMSID of an example record?



Wade Guidry
Library Consortium Services Manager, WACTCLC
wadeg at bigbend.edu
(509) 760-4474
http://www.wactclc.org

From: Wactclc-alma <wactclc-alma-bounces at lists.ctc.edu> On Behalf Of Lesley Caldwell
Sent: Thursday, March 24, 2022 2:35 PM
To: wactclc-alma at lists.ctc.edu
Subject: [Wactclc-alma] Norm Rule to remove rule <U+fffd>

Hi all,

One last question. :) I’m working on another normalization rule that we can’t quite get right! Victoria found this indication rule<https://developers.exlibrisgroup.com/blog/alma-indication-rule-for-bad-diacritics/>, but we don’t just want to find the diacritics, we also want to remove them.

We’ve tried:

rule "Contains <U+fffd> in 0XX or 1XX or 2xx or 3xx or 4xx or 5xx or 6XX or 7XX or 8xx fields"
when
     ((exists "0**.*.*<U+fffd>*") OR (exists "1**.*.*<U+fffd>*") OR (exists "2**.*.*<U+fffd>*") OR (exists "3**.*.*<U+fffd>*") OR (exists "4**.*.*<U+fffd>*") OR (exists "5**.*.*<U+fffd>*") OR (exists "6**.*.*<U+fffd>*") OR (exists "7**.*.*<U+fffd>*") OR (exists "8**.*.*<U+fffd>*"))
then
replaceContents "<U+fffd>" with ""
end

And:

rule "Contains <U+fffd> in 0XX or 1XX or 2xx or 3xx or 4xx or 5xx or 6XX or 7XX or 8xx fields"
when
     (exists "<U+fffd>")
then
replaceContents "<U+fffd>" with ""
end

Does anything stick out as wrong to anyone?

Thanks!

Lesley Caldwell
Systems and Instruction Librarian
Pierce College
p: 253-840-8420
a: 1601 39th Ave SE, Puyallup, WA 98374
w: www.pierce.ctc.edu
e: LCaldwell at pierce.ctc.edu