The Way Forward in Package Insert User Tests From a CRO's Perspective

Joerg Fuchs

Introduction: A significant increase in the amount of text in package inserts has been observed over the last years. This study investigated the consequences of this increase. Method: Five package inserts available on the German medicine market in 2002 and five developed model versions were investigated in a crossover procedure using the written readability test. Results: The more extensive the package inserts, the worse patients feel informed. Increasing the amount of text significantly decreases the ability to locate information, thus putting people off from reading the contents (P 0.021). Discussion: The results suggest that decreasing the amount of text is a key factor, whereby a maximum of 1,500 words per package insert should be the aim. Conclusion: The way forward in package insert user testing is to concentrate on patient requirements and package insert improvements. Appropriate solutions are required for the further development of guidelines, templates, [...]
Drug Information Journal, Vol. 44, pp. 119–129, 2010 • 0092-8615/2010. Submitted for publication: February 2, 2009. Printed in the USA. All rights reserved. Copyright 2010 Drug Information Association, Inc.
EMEA-CMD(h) Workshop on User Testing, December [...]

Since Directive 2004/27/EC came into force, package inserts of medicines that are sold in the European Union require user tests to ensure that they are legible, clear, and easy to use (1). One possible way to carry out these tests is the verbal face-to-face interview method, developed by the Australian communications researchers Sless et al. (2,3). Using this method, 12 to 15 questions concerning the package insert key messages are posed orally by an interviewer to a minimum of 20 medical laymen divided into two [...] with the guidelines and if they are able to detect problems in locating and understanding information (2). One example is the written readability test, which was developed and validated in the PAINT1 survey. In this method the test participants are given the instructions and a minimum of 15 questions regarding the package insert key messages using a questionnaire (4).

Different stakeholders are involved in package inserts and their associated user tests, although patients form the key group in Europe. Patients should be able to locate, understand, and use the information provided in the package inserts. Other interested parties are the pharmaceutical companies, who must create package inserts that comply with the laws and guidelines. Beyond these groups, there are the national and supranational agencies that monitor to ensure package inserts and their user tests comply with legal requirements. Therefore, the task of a contract research organization (CRO) is to support the pharmaceutical companies in developing and testing package inserts that meet the requirements of each stakeholder group. This [...] which cannot always result in the most appropriate package insert for every patient.

In comparison to European package inserts, which are delivered to patients in the packaging of each over-the-counter and prescribed medicine, package inserts in the United States (also called prescription drug labels) are primarily written to give health care professionals the information they need to appropriately prescribe medicines (5). However, both the European and US package inserts have grown over the years in length, detail, and complexity. Therefore, appropriate improvements and measures, such as user [...]

Currently, there are no published results available regarding what can be achieved with the package insert user tests required in the European Union. Table 1 provides optimizations of four selected aspects that one CRO company has achieved using the written readability test method. On average, the number of difficult words in the first package inserts provided by the pharmaceutical companies is reduced from 86 to 14 in the final tested versions. This is a difference of 84% (10). In addition, the 14 remaining difficult words are always explained. There are maximum and minimum results. [...] clients, who have learned from the user test company’s recommendations and test results to provide better package inserts. This is a clear [...]

Another aspect is the amount of text. On average, the number of words contained in package inserts was reduced by 20% (Table 1) (10). One survey with patients and another with medical and pharmaceutical experts showed that this text reduction meets patients’ and experts’ requirements, as both want the text of package inserts to be shorter and limited to only essential points (11). However, the current situation indicates there has been a large text increase over the last years, due to new directives, templates, [...] about the medicines. Therefore, a subanalysis of the PAINT1 survey results was done to investigate the advantages and disadvantages of this [...]

Table 1. Package Insert Optimizations Through the Most Recent User Tests Carried Out by PAINT-Consult Using the Written Readability Test (N = 40, English and German Package Inserts)
Columns: Original Package Inserts (n) | Final Package Inserts (n) | Difference (%)
Surviving row label: Long sentences (over 20 words per sentence)
*All difficult words were explained.

The PAINT1 study was a written readability test using a questionnaire consisting of 15 questions relating to the package inserts’ key messages and 17 questions concerning the participants’ [...]

Five original package inserts, available on the German medicine market, and five previously developed model versions were investigated from September 2002 to April 2003. The models contained the same information as the originals; however, they were optimized using a set of quality criteria such that the amount of text was reduced to one page of A4 paper, printed on both sides. Furthermore, the models contained an optimized design and a larger font size of 11 [...]

The study was a crossover comprehensibility test in two rounds with a break between each round of a minimum of 4 weeks. In the first round, half of the participants received an original and the other half a model version.
The versions were swapped in the second round so that, at the end of this study, every person had tested an original and the corresponding model package [...]

Percentages for information not found and incorrectly comprehended information were determined for the total of tested key messages of each package insert. Furthermore, the median of the time required to locate all 15 answers [...]

In the section that addressed the participants’ personal opinions of the package inserts, each participant used a five-point scale to assess the comprehensibility, legibility, complexity of information, clarity and structure, and their confidence in the described medicine. The participants’ answers were coded as follows: “yes” = 1, “mostly yes” = 2, “other” = 3, “mostly no” = 4, and “no” = 5. Medians were calculated for each of the 17 questions relating to the personal [...]

Afterward, Pearson’s correlation coefficients were calculated using the SPSS 14.0 statistics [...]
• The package insert specifics, such as the amount of [...]
• The percentages of information not located, incorrectly comprehended information, the time needed to locate the information, and the personal [...]

Relationships between participants’ demographic background and their opinions relating to the package inserts were calculated using the Cramér’s V calculation procedure from the SPSS 14.0 statistics program for both package insert groups, originals and models. In addition, for testing differences, Pearson’s chi-square test was used. Afterward, Pearson’s correlation coefficients were also calculated for each package [...]

In the first round, 1,105 people participated, and of these 1,057 took part in the second test round (age 10–92 years; average 38 years) (4).

Figures 1 and 2 illustrate that the participants located the information in each model package insert significantly better than in the corresponding original version. There was a very high correlation between locatability in the original package inserts and the amount of text they contained. An increase in the number of words in the originals led to a significant decrease in ability to locate the contents (P = 0.019), with patients requiring significantly more time to find the information (P = 0.006).

Figure 1. Relationship between the amount of text in package inserts and the percentage of not located information from all 15 tested key messages [...] model; 1, Enalapril; 2, Ibuprofen; 3, Paracetamol [...] (axis: Percentage of Tested Information Not Located).
Figure 2. Relationship between the amount of text in package inserts and the time needed to locate all 15 tested key messages (17). Definitions [...] (axis: Time Required to Locate Tested Information, min).

A high correlation was also found in the model versions between locatability and the amount of text. However, this was not significant.

Furthermore, there was no general relationship between the comprehensibility and the amount of text; long texts, such as from the ibuprofen original, were also well comprehended [...]

In addition, relationships in the group of the originals were found as follows. The locatability of the tested key messages decreased in relation [...]
• Nonquantifiable phrases per total number of words. Words such as “longtime use” are nonquantifiable phrases. They do not enable the reader to clearly rate the importance of the information being communicated. For example, “longtime use” could be interpreted as either a period lasting at least 1 month or a period lasting up to 1 year or [...]
• Difficult words per total number of words.
• Sentences longer than 20 words per total number [...]

These findings were valid for the percentage of located information and for the time needed to find the contents. However, these six relationships were not significant, although their correlation coefficients were between 0.509 and [...]

Furthermore, significantly more participants stated that the information was easy to locate if:
• They needed less time to locate the requested information [...]
• They found more of the 15 tested key messages [...]

With the exception of the amount of text, the correlation coefficients concerning the comprehensibility of the 15 tested key messages were always less than 0.5. Significant correlations between comprehensibility and participant opinions were not found in either package [...]

Further significant relationships were found concerning the amount of text to the following 8 of the 17 participant opinions about the package inserts. An increase in the number of [...]
• First impression of the originals deterred the participants from reading further (originals: P = 0.021; [...]
• Confidence about using the medicine decreased (models: P = 0.022; Figure 5).
• Participants felt worse informed by the information contained in the package inserts (originals: [...]
• Participants more frequently did not want similar package inserts in future (originals: P = 0.015).
• Participants more frequently expressed the opinion that the package inserts contained too much information (originals: P = 0.033, models: [...]
• Information provided in the package insert was more frequently difficult to locate (originals: P = [...]
• Information provided was difficult to understand [...]
• Participants more frequently stated the text was difficult to read (originals: P = 0.032).

Figure 3. Relationship between the amount of text in package inserts and the percentage of incorrectly comprehended information from [...] (axis: Percentage of Incorrectly Comprehended Information).
Figure 4. Relationship between the amount of text in package inserts and the motivation to read the package insert (17). Definitions as in [...]
Figure 5. Relationship between the amount of text in package inserts and the confidence to use the medicine after reading the package insert. Definitions as in Figure 1 (axis: Confidence in the Medicine After Reading the Package Insert).

A similar high number of significant relationships of participants’ opinions about the originals was found in the percentage of difficult words per total number of words. Terms were assessed as potentially difficult for patients based on both the experience of the study leader and their occurrence in medical dictionaries. The higher the percentage of medical terms, the [...]
• They lost confidence in using the medicine (P = [...]
• The package insert did not explain all important [...]
• The package insert contained too much information [...]
• The package insert was difficult to understand (P = [...]
• Complicated sentences were contained (P = 0.017).
• Difficult words were in the package insert (P = [...]
• The information provided was not precise (P = [...]
• At the beginning of the package insert there was less important information (P = 0.042).

The following further relationships concerning the participants’ opinions were detected in the group of the originals:
• An increase of the average number of words per sentence reduced confidence in using the medicine [...]
• The higher the percentage of sentences with subjunctive tenses, the more people stated, after reading the package insert, that the first impression put them off reading the information (P = 0.005), the package insert contained too much information (P = 0.049), and the important information was not provided at the beginning (P = 0.01).
• The increase in the percentage of words in brackets per total number of words led to more patients feeling worse informed about the medicine (P = 0.038); more people stated that the package insert contained too much information (P = 0.019); fewer participants wanted the package insert in the future (P = 0.02); and the information provided was more difficult to locate (P = 0.043) and to read (P = [...]
• The higher the percentage of words longer than 20 letters, the more participants stated that the text was difficult to understand (P = 0.042).

Significant influences of demographic data
on the participants’ opinions were often found (P ≤ 0.005). However, Cramér’s V was never over 0.195, which showed that only very weak relationships exist. The most significant dependencies within the 17 investigated opinions were found in the aspect of age (originals 15×, models 14×) followed by education level (originals 14×, models 11×), sex (originals 5×, models 12×), participant’s mood (originals 8×, models 10×), and finally, the medicine use (originals 2×, [...]

[...] showed that an increase in age led to higher motivation to read the information. This was significant in 9 of the 10 investigated package inserts (P ≤ 0.029). However, the older the participants, the more frequently the opinion was stated for every original that too much information was contained (P ≤ 0.029).

Apart from the number of difficult words contained in package inserts, the amount of text is also a very important key factor in the use of package inserts. Even if extensive package inserts are not generally less comprehensible, they significantly reduce the possibilities of locating the information and patients more [...] shorter versions. In addition, long texts decrease the motivation to read the provided contents and in the end only a few people will [...]

Furthermore, patients who have less trust in their medicines caused by extensive texts will more frequently not comply with the instructions. In the worst case they will not use the prescribed medicines. More significant influences of the amount of text on the use of package inserts as described in the results are anticipated as only two different leaflet types, each with five versions, were investigated in this survey.

In a readability test study with 40 people, Dickinson et al. (13) also found problems as a result of long package inserts. Therefore some of the participants recommended shortening of [...]

An investigation of 68 German package inserts from frequently used medicines selected in the year 2000 found an average text amount of 1,496 words (12). The unpublished PAINT2 survey of 271 package inserts, randomly selected from all versions available on the German medicine market in 2005, showed a significant text increase over 5 years to an average of 2,004 words per leaflet. Further rises in the amount of text are expected, for example, due to the new demands of Directive 2004/27/EC (1), the text increase of the QRD template (a text frame for package inserts in the European Union) (14), and continuing practical experiences with the [...]

[...] FDA, to provide a summary of the most important contents at the beginning of package inserts would cause a further increase in the volume of text (15). However, the results of the five model package inserts showed that shorter leaflets without this summary are appropriate to inform patients about the medicines, so the FDA [...] European package inserts. It follows that the current [...] 2001/83/EC to require a summary of the essential package insert information (16) similar to the FDA requirements must therefore be assessed as inappropriate. In addition, no evidence-based research is available to suggest that such a summary, as used in the United States to inform health professionals, is helpful for patients when included in European package inserts. This is particularly the case as other layout and design aspects, such as bold print, were successfully used in each of the five models to emphasize the most important information. Apart from the negative effects of increasing the amount of text, a summary could more frequently lead to patients not reading the [...]

The serious negative effects of extensive package inserts should be more considered in future approaches. Therefore, shortening package inserts is an important aspect to consider. A first step would be to undertake measures that we can immediately put into practice before or during the package insert user tests. Examples [...]
• Avoiding repetitions and extensive explanations
• Using short points instead of long sentences
• Reducing the text that is intended only for doctors

GUIDELINES, TEMPLATES, AND DIRECTIVE IMPROVEMENTS

The second step is related to ongoing guideline, template, and directive optimizations, as user tests alone cannot always reduce the text to the optimal amount of fewer than 1,000 words or a [...]

One possibility for optimization concerns the existing Quality Review of Documents (QRD) template, which contains over 500 words (14). The results of this survey with the five model package inserts, containing a template text of around 200 words, indicate that shorter templates are sufficient and lead to significantly better results in the locatability of information.

Furthermore, consistency between different guidelines, templates, and directives would be helpful. However, some documents lack this consistency, with the result that less appropriate texts are used in package inserts. For example, the QRD template contains long sentences, repetitions, and abbreviations while both the old and new readability guidelines recommend [...]

A further aspect that should be reconsidered is the requirement to include less important information. Since the implementation of Directive [...] medicine is sold in the different European Union member states have to be provided in package inserts (1). However, this information is less important for patients (11). Should it be required by individuals, every pharmacist or doctor, and especially the manufacturer, should be able to provide this information upon request.

In addition, the guidelines, templates, and directives should focus more on the essential aspects. Here there are parallels to package inserts, as precise, comprehensible, concise, and realistic rules can be better put into practice. If more people understand our guidelines and templates, this will help us to move forward.

Guidelines, templates, and directives should more closely reflect scientific and practical experience. Data relating to the amount of text were already provided above. However, applying research results more frequently in these documents will avoid unrealistic recommendations such as the recent proposal for amending [...] amended text in package inserts shall, for a period of 1 year, be presented in bold, with an extra symbol and the words “New information” (16). Reasons for the impracticability of this [...]
• Many medicines have a shelf life of up to 5 years. Patients receiving medicines shortly before the shelf life ends would have information presented as new that would in fact be old. In addition, patients could be confused when text is emphasized in one package insert as new, not emphasized in the next package insert, then in a later package insert again [...]
• The use of bold print is very appropriate to emphasize the most important information. But too frequent use of bold print decreases the effect of [...]
• The suggested presentation of new information will need an annual update of all package inserts delivered in the more than 100,000 medicines available in the European Union. This could lead to a collapse of the agencies that have to approve each amendment and increase the costs of the [...]
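As an illustration only, the text measures this article relies on (total number of words and sentences of more than 20 words) can be counted mechanically. The sketch below is not the procedure used in the PAINT surveys. The function names, the naive sentence splitter, and the threshold parameter are assumptions made for this example. The percentage helper reproduces the reported reduction in difficult words from 86 to 14.

```python
import re


def text_metrics(text, long_sentence_words=20):
    """Count words, sentences, and long sentences in a package insert text.

    Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    Real leaflets (abbreviations, bullet points) need a better splitter.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words_per_sentence = [len(s.split()) for s in sentences]
    return {
        "words": sum(words_per_sentence),
        "sentences": len(sentences),
        # "Long" follows Table 1: more than 20 words per sentence.
        "long_sentences": sum(
            1 for n in words_per_sentence if n > long_sentence_words
        ),
    }


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    sample = "Take one tablet daily. Do not take more than the stated dose."
    print(text_metrics(sample))
    # Difficult words per package insert: 86 before, 14 after testing.
    print(round(percent_reduction(86, 14)))
```

Comparing the word count of a draft with a chosen limit, such as the 1,500 words proposed in the abstract, is then a single comparison on the `words` value.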
[...] guidelines, templates, and directives should be provided in fewer documents and preferably on one central website so that every person has easy access to the most up-to-date documents. The Regulatory and Procedural Guidance website of the EMEA (20) is, in this case, a good opportunity for everything that should be considered in package inserts and user tests to be provided in one place. If guidelines were presented in fewer documents, it would avoid confusion. For example, rules relating to user testing are provided in the European Commission document from 2006 (2), in the readability guideline (19), in national guidelines, and in other sources. As a result, discussions were [...] and agencies about the actual requirements. A similar situation exists in the recommendations to write easily readable and comprehensible [...]

USER TEST SUCCESS CRITERIA

Realistic and appropriate success criteria are also required for the user tests. The current success criterion in the verbal face-to-face interview method is the 90/90% rule. In total, a minimum of 80% of the participants should be able to use each tested key message (2,3,19).

The written readability test success criteria are similar. One of them is that in total 80% of the participants should be able to locate and understand each tested key message. There was a discussion relating to higher success criteria. Professor Sless, one of the verbal interview method developers, did not recommend higher user test success criteria than we have at the moment because, in his opinion, they are unattainable without falsifying the results (21). Furthermore, many people have significant reading and writing difficulties. For example, 5% of the German adult population have these problems (22). Therefore, aiming higher than the current success criteria would not be an appropriate way forward.

Nevertheless, it is necessary to focus on the improvement of the complete package insert and not only on the tested information, for example, by counting the number of difficult words and the amount of text. For this reason, improvements of the entire package insert are done as the first step of the written readability tests before the first test round with patients starts. Especially in this step, the main optimizations are achieved and the tests with people help to fine-tune the package inserts.

[...] pharmaceutical company addresses, particularly in centralized approved package inserts (14). No patient sees the need for almost 30 different addresses of the same company. Therefore, reducing the number of addresses would be another good opportunity to shorten and optimize [...]

Another aspect is the right time to carry out user tests. The current situation is that these tests are done before the approval procedure starts. However, this can result in text changes after the user tests. For example, difficult words and extensive paragraphs may reoccur in the successfully tested version. One reason is the approval procedure. A suggestion is, therefore, that the package insert texts are first optimized and in a second step the pharmaceutical companies submit these texts to the agencies for approval. Afterward, the first test round with patients can start. This recommendation is based on practical experience, as PAINT-Consult has used it together with some companies in the past and it avoided extensive text changes and text increase after the test. However, this is only a suggestion to provoke discussion, since general changes such as this require the existing approval procedure to be modified, which is not so [...]

A further point to take into consideration is the harmonization of package inserts in Europe. [...] successfully tested package insert, but the agencies demand to use another tested version for harmonization. As a result, the best and shortest package insert is not always used. Therefore, clear rules are required about when a harmonized text should be used or not and specifying that the best and shortest version is to be used. Otherwise we waste time and resources, and trust will be lost in the current system.

User test research shows that ongoing package insert improvements are absolutely necessary. These can be achieved through user testing. However, apart from the user test success criteria, testing should focus on the improvement of the entire package inserts. Reducing the amount of text is one very important point. Establishing the right time to carry out user tests and clear rules for harmonization of the package inserts should also be considered.
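The success criteria described under USER TEST SUCCESS CRITERIA can be made concrete with a small check. This is a sketch under stated assumptions, not a published scoring procedure. It assumes the common reading of the 90/90% rule (at least 90% of participants locate a key message, and at least 90% of those understand it), which gives roughly the 80% overall threshold named in the text. The class name, field names, and sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KeyMessageResult:
    participants: int  # people who were asked about this key message
    located: int       # of those, how many found the information
    understood: int    # of those who found it, how many understood it


def passes_90_90(r: KeyMessageResult) -> bool:
    """Assumed 90/90% rule: 90% locate, and 90% of locators understand."""
    if r.participants == 0 or r.located == 0:
        return False
    # Integer arithmetic avoids floating-point edge cases at exactly 90%.
    located_ok = 100 * r.located >= 90 * r.participants
    understood_ok = 100 * r.understood >= 90 * r.located
    return located_ok and understood_ok


def passes_overall(r: KeyMessageResult, minimum_percent: int = 80) -> bool:
    """Overall criterion: at least 80% locate and understand the message."""
    if r.participants == 0:
        return False
    return 100 * r.understood >= minimum_percent * r.participants


if __name__ == "__main__":
    # Hypothetical results for two key messages tested with 20 participants.
    dosage = KeyMessageResult(participants=20, located=19, understood=18)
    storage = KeyMessageResult(participants=20, located=18, understood=16)
    for name, result in (("dosage", dosage), ("storage", storage)):
        print(name, passes_90_90(result), passes_overall(result))
```

In the hypothetical "storage" case, 16 of 20 participants meet the overall 80% criterion, yet only 16 of the 18 who located the text understood it, so the two-stage reading of the rule is not met. The two checks can therefore disagree on borderline results.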
