

Prediction of Binding Constants of α-Cyclodextrin Complexes

KENNETH A. CONNORS

Received April 12, 1996, from the School of Pharmacy, University of Wisconsin, 425 North Charter Street, Madison, WI 53706. Final revised manuscript received May 17, 1996. Accepted for publication May 23, 1996.ˣ

Abstract. Proceeding from a phenomenological theory of pairwise interactions (solvent-solvent, solvent-solute, and solute-solute), the binding constant K₁₁ (in M⁻¹) for 1:1 complex formation by α-cyclodextrin at a substrate binding site, at 25 °C in water, is given by log K₁₁ = −1.74 − [Z] + 0.032(−ΔA), where [Z] incorporates solvent-solute (solvation) and solute-solute interactions and ΔA is the decrease in nonpolar surface area (in Å² molecule⁻¹) on the substrate that is exposed to solvent when the binding site enters the cyclodextrin cavity. ΔA is estimated from the structure of the binding site. Three levels of approximation are described for estimating [Z]. At the third (highest) level, the procedure when applied to 569 complex systems generated predicted values of log K₁₁ that agreed within 0.30 unit of the experimental values in 58% of cases, and that agreed within 1.00 unit in 95% of cases.

In a recent paper¹ I defined two important unsolved problems in cyclodextrin (CyD) chemistry in the form of these questions: (1) Given the identities of a CyD host, any guest molecule, and a solvent, what will be the stability of the complex formed between the host and the guest; that is, what is the binding constant for complex formation? (2) What are the energetic sources of the complex stability? The present paper addresses the first of these questions in detail, the host being α-CyD and the solvent being water (at 25 °C), with some attention being given to the second question.

At the outset we should be clear on the meaning of the word prediction in the title. This word is being used here in precisely the sense it is given by writers describing methods for the calculation of pKa values of acids and bases² or of partition coefficients.³,⁴ Such methods are empirical in nature, though guided by physical-chemical concepts; they consist largely in the identification of patterns of behavior as reflected in group additivity schemes or linear free energy relationships, and the application of these empirical relationships to the calculation of quantities for systems of interest. This is the kind of approach taken in the present work.

Of course, the literature of cyclodextrin studies includes very many correlations of complex stability with "descriptors" characteristic of the guest molecules, such descriptors including the number of carbon atoms in an alkyl chain, Hammett substituent constants, partition coefficients, and solubilities. Some of these correlations are quite good and provide a capability for predicting binding constants for additional members of the series being correlated. It is characteristic of such correlations, however, that they are severely limited in their range of application, so they do not answer the first of our two questions in any general way. More general approaches can give useful insight, but are qualitative. Thus we have found it helpful to interpret CyD complex stabilities in terms of these three postulates: Complex stability is increased by (1) an increase in binding-site electron density, (2) an increase in binding-site polarizability, and (3) a decrease in binding-site polarity.⁵

A survey of the cyclodextrin literature¹ has revealed that α-CyD complex stabilities, as expressed by log K₁₁ (where K₁₁ is the equilibrium constant for 1:1 stoichiometric binding, with units M⁻¹), are normally distributed, with mean value 2.11 and standard deviation 0.90 (n = 663). Consequently virtually all α-CyD complex stabilities, as log K₁₁ in water at 25 °C, fall in the range −1 to +5. At the present level of experimental work, the typical uncertainty in any log K₁₁ value is of the order 0.3 unit (though a few constants are known with much greater confidence). These figures provide a setting for the goal of the present work, which is to provide calculational methods for the prediction of log K₁₁ (for α-CyD in water at 25 °C) for any guest molecule with an accuracy of 0.3 unit. We shall find that we do not quite achieve this goal in totality, but that we make very substantial progress toward it.
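The statement that virtually all values fall between −1 and +5 can be checked against the quoted distribution. The short Python sketch below assumes an exactly normal distribution with the survey's mean and standard deviation (an idealization of the real data) and computes the expected fraction in that range; the function names are illustrative only.

```python
import math

MEAN_LOG_K11 = 2.11  # survey mean of log K11
SD_LOG_K11 = 0.90    # survey standard deviation of log K11


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def fraction_in_range(low, high, mean=MEAN_LOG_K11, sd=SD_LOG_K11):
    """Fraction of a normal population expected between low and high."""
    return normal_cdf(high, mean, sd) - normal_cdf(low, mean, sd)


# Expected share of log K11 values between -1 and +5.
print(f"{fraction_in_range(-1.0, 5.0):.4f}")
```

Under this assumption the range −1 to +5 spans a little more than three standard deviations on either side of the mean, which is why so few values are expected outside it.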

ˣ Abstract published in Advance ACS Abstracts, July 1, 1996.

S0022-3549(96)00167-0 CCC: $12.00. © 1996, American Chemical Society and American Pharmaceutical Association. Journal of Pharmaceutical Sciences, Vol. 85, No. 8, August 1996, p 796.

Theory

Our laboratory has developed a phenomenological theory of solvent effects⁶ that, unexpectedly, turns out to be helpful in the present context. The free energy change of any process is considered to receive contributions from solvent-solvent interactions (giving rise to the general medium or solvophobic effect), solvent-solute interactions (the solvation effect), and solute-solute interactions (the intersolute or intrasolute effect).⁶,⁷ One of the processes to which the theory has been applied is complex formation.⁸,⁹ In the fully aqueous solution, our present concern, the free energy change of complex formation is given by eq 1.

$$\Delta G^{*}_{\mathrm{comp}} = \Delta G^{C}_{\mathrm{intrasol}} + (\Delta G^{C}_{w} - \Delta G^{S}_{w} - \Delta G^{L}_{w}) + \Delta gA\,\gamma_1 \tag{1}$$

In eq 1, $\Delta G^{*}_{\mathrm{comp}}$ is given by eq 2,

$$\Delta G^{*}_{\mathrm{comp}} = -kT \ln K_{\mathrm{mf}} \tag{2}$$

where $K_{\mathrm{mf}}$ is the binding constant on the mole fraction scale, and the free energy change is on a per molecule basis; k and T have their usual meanings. $\Delta G^{C}_{\mathrm{intrasol}}$ describes substrate-ligand interaction within the complex. (In the present context, the guest molecule is the substrate and α-CyD is the host or ligand.) The $\Delta G^{C}_{w}$, $\Delta G^{S}_{w}$, $\Delta G^{L}_{w}$ quantities are solvation energies for the complex C, the substrate S, and the ligand L, respectively. The $\Delta gA\,\gamma_1$ term describes the general medium effect, $\gamma_1$ being the surface tension of water and $\Delta gA$ being given by $\Delta gA = gA^{C} - gA^{S} - gA^{L}$. Each quantity A is a molecular surface area (actually the nonpolar molecular surface area),¹⁰ and g is an empirical factor that corrects for the effect of curvature on the surface tension. In complex formation, $\Delta gA$ is negative, and this term constitutes a driving force for complex formation; in fact, it is one way to describe the hydrophobic effect.

The binding constant $K_{\mathrm{mf}}$ on the mole fraction scale is related to K₁₁ on the molar scale by eq 3, where ρ is the solvent density and M* is the number of moles of solvent per kilogram of solvent:

$$K_{\mathrm{mf}} = \rho M^{*} K_{11} \tag{3}$$

For convenience, we write $Z = \Delta G^{C}_{\mathrm{intrasol}} + (\Delta G^{C}_{w} - \Delta G^{S}_{w} - \Delta G^{L}_{w})$. Then combination of eqs 1-3 gives eq 4.

Equation 4 is general for noncovalent association. We next make the specific application to water at 25 °C by inserting the quantities ρ = 1.00, M* = 55.55, k = 1.38 × 10⁻¹⁶ erg K⁻¹, T = 298.15 K, and γ₁ = 71.8 erg cm⁻². Equation 4 then becomes eq 5, where [Z] = Z/2.3kT, and g, which is treated as a constant, has been factored out of ΔgA. The quantity g has been independently estimated,⁷,⁹,¹⁰ from solvent effect studies, to have the value 0.42 ± 0.05; inserting this into eq 5 gives eq 6, which is written in this way because ΔA is negative, so −ΔA is a positive quantity (having the units Å² molecule⁻¹). [Z] is on the same scale as log K₁₁. Equation 6 expresses log K₁₁ as a function of just two quantities, namely, the change in nonpolar surface area as the substrate and ligand associate to form the complex, and [Z], which incorporates solvation energies and the substrate-ligand interaction energy. It should be noted, from the definitions of [Z] and Z, that this quantity is composed of four terms, two preceded by positive signs and two by negative signs, so that at least some compensation of terms can be expected in [Z]. Prediction of log K₁₁ consists of making estimates of ΔA and [Z] for use in eq 6.
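The numerical coefficients in eqs 5 and 6 follow directly from the quantities just listed. The Python sketch below reproduces them and wraps eq 6 in a small function. The function and variable names are illustrative and not from the paper, the density is assumed to be in kg L⁻¹ (the text gives only the number), and ln 10 is used where the text writes 2.3.

```python
import math

# Quantities quoted in the text for water at 25 °C (CGS units).
RHO = 1.00            # solvent density; kg per litre is an assumption
M_STAR = 55.55        # moles of solvent per kilogram of solvent
K_BOLTZMANN = 1.38e-16  # erg K^-1
TEMPERATURE = 298.15  # K
GAMMA_1 = 71.8        # surface tension of water, erg cm^-2
G_CURVATURE = 0.42    # empirical curvature correction factor g

CM2_PER_A2 = 1.0e-16  # 1 Å^2 expressed in cm^2


def eq5_coefficients():
    """Return the intercept and the coefficient of g*dA in eq 5."""
    intercept = -math.log10(RHO * M_STAR)
    thermal = math.log(10.0) * K_BOLTZMANN * TEMPERATURE  # "2.3kT" in erg
    area_coeff = GAMMA_1 * CM2_PER_A2 / thermal           # per Å^2
    return intercept, area_coeff


def log_k11(z, minus_delta_a, intercept=-1.74, slope=0.032):
    """Eq 6: log K11 from [Z] and the buried nonpolar area (Å^2)."""
    return intercept - z + slope * minus_delta_a


def z_from_log_k11(log_k11_value, minus_delta_a, intercept=-1.74, slope=0.032):
    """Eq 6 rearranged to obtain [Z] from a known log K11."""
    return intercept + slope * minus_delta_a - log_k11_value


intercept, area_coeff = eq5_coefficients()
print(f"intercept = {intercept:.2f}")
print(f"gamma_1 / 2.3kT = {area_coeff:.4f} per Å^2")
print(f"slope with g = 0.42: {area_coeff * G_CURVATURE:.3f}")
```

The rearranged form `z_from_log_k11` is included because the same equation is used later in the text to obtain [Z] from experimental binding constants.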

$$\log K_{11} = -\log \rho M^{*} - \frac{Z}{2.3kT} - \frac{\Delta gA\,\gamma_1}{2.3kT} \tag{4}$$

$$\log K_{11} = -1.74 - [Z] - 0.0758\,g\,\Delta A \tag{5}$$

$$\log K_{11} = -1.74 - [Z] + 0.032(-\Delta A) \tag{6}$$

Implementation

Throughout this work use is made of hundreds of α-CyD binding constant estimates drawn from the primary literature. All of these results, with citations to the original sources, are compiled in a recent survey.¹

Identification of Binding Sites. Some guest species are small enough to be included in their entirety within the α-CyD cavity (the chloride ion is an example), but most guests are larger than the cavity. We therefore focus our attention on a portion of the guest, called the binding site, that upon complex formation penetrates into the CyD cavity. For example, 4-chlorobenzoic acid is reasonably expected to possess two binding sites, these being the chloro site and the carboxy site (both binding sites including a substantial portion of the contiguous aromatic ring). As we see from this example, a guest may possess more than one binding site, and 1:1 complex formation can take place at each binding site, with the concurrent formation of isomeric 1:1 complexes. (In the present work we do not consider the subsequent possible formation of complexes of higher stoichiometric ratios.) The observed binding constant, K₁₁, is the arithmetic sum of the microscopic binding constants describing the formation of the isomeric 1:1 complexes.⁵,¹¹

Quite a bit is known about preferred binding sites in CyD systems. Perhaps the most direct information comes from X-ray crystallographic studies, though its applicability to the solution state may be questionable. Spectroscopic techniques, particularly nuclear magnetic resonance, can identify the site of binding.¹²,¹³ Guest structure-complex stability relationships provide insights.⁵,¹⁴,¹⁵ By such means we infer, for example, that, in both 4-substituted phenols and their corresponding anions, the primary binding site is the 4-substituent; on the other hand, in 4-substituted benzoic acids, the primary site is the carboxylic acid site, the 4-substituent being a significant but less important site. Upon ionization, the 4-substituent site becomes the dominant site, with binding at the carboxylate site being insignificant. When experimental evidence is lacking, binding-site identification must proceed by analogy with related compounds, or by invoking chemical ideas such as the three postulates listed in the introduction. Molecular models can be helpful in showing whether or not a proposed binding site can fit into the cavity. (Extracavity binding, that is, binding of the substrate to the outside of the CyD, seems possible for some substrates, but little is known about this process.)

At this stage of the procedure it is not necessary to make judgments about the relative importance of two or more binding sites within a guest molecule; the later calculations will provide quantitative estimates. The need at this stage is simply to identify the chemically reasonable binding sites. For many substrates, exemplified by 1,4-disubstituted benzenes, this is easy to do, but for some structures, such as heterocyclic compounds, for which little systematic information is available, it may not be possible at this time.

Estimation of ΔA. When a semipolar molecule goes into aqueous solution, only the nonpolar surface area of the molecule makes a contribution (unfavorable in this case) to the general medium (hydrophobic) effect.¹⁰ Now consider a binding site on a guest molecule undergoing inclusion in a CyD cavity (in aqueous solution); the situation is just opposite, energetically, to the solubility case. Now the nonpolar portion of the binding site surface area makes a favorable contribution to the general medium effect. Thus we are led to define −ΔA as the nonpolar surface area of the binding site that is included in the CyD cavity. That is, ΔA, a negative quantity, is the nonpolar area that is removed from contact with water, eq 7:

$$-\Delta A = \text{area of nonpolar portion of binding site} \tag{7}$$

For a binding site that essentially fills the CyD cavity, an alternative calculation is given by eq 8:

$$-\Delta A = \text{internal area of CyD cavity} - \text{area of polar portion of binding site} \tag{8}$$

Obviously −ΔA cannot be larger than the internal area of the CyD cavity. For use in eq 6, all areas are expressed in Å² molecule⁻¹.

We first must define nonpolar. In a study of solvent effects on α-CyD complex stability⁹ we recently inferred that the polarity of the α-CyD cavity corresponds to log P = −0.3, where P is the 1-octanol/water partition coefficient. For our present purpose this quantity separates the polar from the nonpolar surface areas. Thus any "part" of the binding site that possesses a log P value more positive than −0.3 is defined to be nonpolar. The fragmental constants of Nys and Rekker¹⁶ are taken as measures of log P. For example, f = 0.70 for CH3, which therefore is nonpolar. For aliphatic COOH, f = −1.00, whereas for aromatic COOH f = 0.00; thus the former is polar whereas the latter is nonpolar.

Next we need estimates of the areas of portions of binding sites. These were obtained with a technique used earlier⁶ to obtain molecular surface areas. A sheet of aluminum foil smoothly wrapped over a CPK molecular model is weighed, and the weight is converted to an area by means of a calibration. This simple technique gives results consistent with elaborate computer methods. Table 1 lists values of areas of groups obtained in this way. (Obviously there is chemical judgment involved in delineating the border between polar and nonpolar areas; this ambiguity is present no matter what calculational technique is applied to find the area.)

The internal area of the α-CyD cavity was estimated in several ways. (1) By the foil technique the value 126 ± 6 Å² molecule⁻¹ was obtained. (2) Molecular models and X-ray crystal data¹³ indicate that the cavity is about 8 Å long and 4-6 Å in diameter, yielding an internal area of 125 ± 25 Å² molecule⁻¹. (3) Solvent effect studies⁹ gave an experimental estimate for −gΔA of 58 ± 10 Å², giving (with g = 0.42) 138 ± 29 Å² molecule⁻¹. (4) If we suppose that the mean value of log K₁₁ for α-CyD, 2.11 ± 0.90, occurs when [Z] = 0, then with eq 6 we obtain 120 ± 32 Å² molecule⁻¹.

The very reasonable agreement among all these estimates is encouraging. In the following the value 125 Å² is taken as the cavity area of α-CyD; this is therefore the maximum value of −ΔA for use in eq 6. The minimum value of −ΔA is 0.

The value of −ΔA makes a critical contribution in eq 6, and with the accumulation of further data and experience we can expect our estimates of this quantity to undergo revision. The role of polar groups may be particularly subtle. Although polar portions do not contribute to −ΔA as defined above, and therefore make no energetic contribution to complex stability via the general medium effect, it is quite possible for them to undergo solute-solute interactions and thus to contribute to complex stability via the intrasolute effect; this contribution will be made through [Z].

Isomerism and Statistical Effects. We saw above that it is possible for a substrate to form isomeric 1:1 complexes if it possesses more than one binding site, and that the experimental K₁₁ value is the sum of the microscopic binding site constants. Let $K^{J}_{\mathrm{bs}}$ be the microscopic binding-site constant for complex formation at binding site J, and suppose $n_J$ is the number of identical J binding sites. Then in general eqs 9 and 10 can be written. It is important to note that the $K^{J}_{\mathrm{bs}}$, not the log $K^{J}_{\mathrm{bs}}$, are additive. In the particular case of a guest molecule that possesses just $n_A$ identical A binding sites, eq 9 becomes $K_{11} = n_A K^{A}_{\mathrm{bs}}$, and eq 10 becomes eq 11. An example is a 1,4-symmetrically disubstituted benzene, for which $n_A$ = 2; in this case log K₁₁ is 0.30 unit larger than log $K^{A}_{\mathrm{bs}}$. Systems of this type show a statistical effect, for although complexes can form at each of the $n_A$ sites, these complexes are identical. The quantity $n_A$ is a statistical factor expressing the increased probability of complex formation compared with a substrate that possesses a single binding site.
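Equations 9-11 amount to summing the microscopic constants before taking the logarithm. A minimal Python sketch of that bookkeeping follows; the function names are hypothetical, and each site is represented as a pair (number of identical sites, log of the microscopic constant).

```python
import math


def log_k11_from_sites(sites):
    """Eqs 9 and 10: sum n_J * K_bs^J over sites, then take log10.

    sites is an iterable of (n_j, log_k_bs_j) pairs.
    """
    total_k = sum(n_j * 10.0 ** log_k_bs for n_j, log_k_bs in sites)
    return math.log10(total_k)


def statistical_increment(n_identical_sites):
    """Eq 11: the amount added to log K_bs by n identical sites."""
    return math.log10(n_identical_sites)


# Two identical sites raise log K11 by log 2, about 0.30 unit.
print(f"{statistical_increment(2):.2f}")

# The constants add, not their logarithms: two different sites with
# log K_bs of 2.0 and 3.0 give about 3.04, not 5.0.
print(f"{log_k11_from_sites([(1, 2.0), (1, 3.0)]):.2f}")
```

The second call illustrates why a weak secondary site changes the observed constant very little when a much stronger site is present.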

$$K_{11} = n_A K^{A}_{\mathrm{bs}} + n_B K^{B}_{\mathrm{bs}} + \cdots \tag{9}$$

$$\log K_{11} = \log\left(n_A K^{A}_{\mathrm{bs}} + n_B K^{B}_{\mathrm{bs}} + \cdots\right) \tag{10}$$

$$\log K_{11} = \log n_A + \log K^{A}_{\mathrm{bs}} \tag{11}$$

Now consider an n-alkane. Molecular models show that about six methylene groups completely fill the α-CyD cavity, so that for n-hexane we set $n_A$ = 1 in eq 11. n-Heptane offers two equivalent six-atom sites, so $n_A$ = 2, for n-octane $n_A$ = 3, and so on.¹⁷

Table 1. Molecular Group Areas for Use in Calculating −ΔA

| Group | Area/Å² molecule⁻¹ | Group | Area/Å² molecule⁻¹ |
|---|---|---|---|
| CH | 18 | F | 23 |
| CH2 | 21 | Cl | 42 |
| CH3 | 43 | Br | 49 |
| C6H4 | 84 | I | 59 |
| C6H5 | 103 | NO2 | 47 |
| C6H6 | 110 | NH2 | 38 |
| C5H9 | 119 | OH | 28 |
| C5H10 | 133 | COOH | 56 |

Estimation of [Z]. The approach that has been taken is to make use of experimental K₁₁ values and estimated ΔA values, substituting these quantities into eq 6 to calculate the corresponding [Z]. Patterns of behavior were then sought relating [Z] to guest structure. Statistical effects and isomerism were taken into account so that the [Z] values apply to binding sites. The outcome has been methods for the estimation of [Z] at three levels of approximation.

Level I. [Z] values were calculated as described above for a set of 229 systems whose guest molecules have relatively "simple" chemical structures. This set included many meta- and para-substituted aromatics, straight-chain aliphatics, monofunctional compounds, etc., but excluded most heterocycles and substances such as many drug molecules. A plot of log K₁₁ against −ΔA showed a trend with much dispersion. All of the points were included within an "envelope" defined by these upper and lower limits:

$$\log K_{11}^{\max} = 0.0225(-\Delta A) + 1.75$$

$$\log K_{11}^{\min} = 0.0066(-\Delta A) - 0.40$$

Equation 12 is the mean of these two lines:

$$\log K_{11} = 0.0146(-\Delta A) + 0.68 \tag{12}$$

Equation 12 constitutes the level I approximation to predicting α-CyD binding constants. The approach is to identify the dominant binding site on the guest, to estimate −ΔA, and to apply eq 12. A statistical factor is applied if appropriate.

The empirical eq 12 can be compared with the theoretical expression, eq 6. The difference between them reflects a dependence of [Z] on −ΔA, this relationship being embodied in eq 12. Thus [Z] is not explicitly estimated in the level I calculation.

Example 1: Benzoic Acid. The binding site is the carboxy site, which completely fills the α-CyD cavity, so −ΔA = 125, and eq 12 gives log K₁₁ (calcd) = 2.51. The experimental value is 2.88.

Example 2: Bromide Ion. For this very polar species −ΔA = 0, so log K₁₁ (calcd) = 0.68. The experimental value is 0.32.

Level II. At this second level of approximation typical values of [Z], obtained as described above from experimental log K₁₁ values, are assigned to various binding-site structures. No attempt is made, at this level, to account for subtleties such as substituent effects; instead all binding sites of a given structural class (e.g., Ar-Cl) are assigned the same [Z] value. Table 2 lists these level II [Z] values. The uncertainties given for these figures are standard deviations. The values of −ΔA adopted when calculating [Z] are given for some binding sites when the selection of ΔA may not be obvious.

Example 3: 4-Iodoaniline (Neutral Form). The iodo-substituted end of the molecule is evidently the binding site, with −ΔA = 125. From Table 2, [Z] = −1.0. Equation 6 gives log K₁₁ (calcd) = 3.26. The experimental value is 3.45. For the cationic form of this substrate the level II calculation will again give 3.26, whereas the experimental value is 3.04.

Example 4: 1-Propanol. −ΔA = 43 + 2(21) = 85, and [Z] = −0.6, giving log K₁₁ (calcd) = 1.58. The experimental value is 1.36.

Example 5: 1,4-Dichlorobenzene. −ΔA = 125, and [Z] = +0.1, so eq 6 yields 2.16. However, this is a substrate with two identical sites, so $n_A$ = 2 in eq 11, and the result is log K₁₁ (calcd) = 2.46. The experimental value is 2.36.

Level III. At this level account is taken of the effects of substituents elsewhere in the molecule on the [Z] value of the binding site by seeking linear free energy relationships between [Z] and appropriate substituent parameters. Other types of correlations are also found to be useful. Table 3 lists the level III estimating methods for [Z] that have been developed in this work. The independent variables that appear in the Table 3 correlations are defined in the list that precedes Table 3.

Example 6: 3-Nitrophenol. The NO2 site is the binding site, with −ΔA = 125. For the conjugate acid, σm = +0.13 for the m-OH group, so [Z] = +0.33, and log K₁₁ (calcd) = 1.93; the experimental value is 2.03. For 3-nitrophenolate σm = −0.47, giving [Z] = −0.38 and log K₁₁ (calcd) = 2.64; the experimental result is 2.57.

Example 7: 3-Methyl-2-butanol. A CPK model shows that the alkyl portion can be included in the α-CyD cavity, giving −ΔA = 125. σ* = 0.00 for CH3, −0.19 for CH(CH3)CH3, and +0.49 for H, so Σσ* = +0.30. From Table 3, [Z] = +0.92, and with eq 6 we find log K₁₁ (calcd) = 1.34; the experimental value is 1.27.

Example 8: Iodide Ion. −ΔA = 0, $R_D$ = 17.53, and pKa = −9, so [Z] = −2.93 and log K₁₁ (calcd) = 1.19; experimentally 1.17 is found.

Example 9: Dodecylammonium Ion. The binding site (CH2)6-NH3⁺ has −ΔA = 125; the premise is that the maximum available nonpolar area is inserted into the α-CyD cavity, and the functional group interacts with the rim of the torus in an extracavity manner. From the CH3(CH2)nX model in Table 3, [Z] = −1.17, so we calculate log K₁₁ (calcd) = 3.40. However, there are also five alkane binding sites, each with −ΔA = 125 and [Z] = +1.03, so each provides a microscopic binding constant of 17 M⁻¹ (from eq 6). Using eq 9 yields log K₁₁ (calcd) = 3.41; the alkane contribution is negligible. The experimental value is 3.36.
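The multi-site arithmetic of Example 9 can be retraced numerically. The sketch below is only an illustration with made-up variable names: it takes the value 3.40 quoted in the text for the ammonium-terminated site as given, computes the alkane-site constant from eq 6 with [Z] = +1.03, and combines the sites with eq 9.

```python
import math

CAVITY_AREA = 125.0  # maximum -dA for α-CyD, Å^2 per molecule


def log_k_site(z, minus_delta_a):
    """Eq 6 applied to a single binding site."""
    return -1.74 - z + 0.032 * minus_delta_a


# Value quoted in the text for the (CH2)6-NH3+ site.
log_k_primary = 3.40

# Each alkane site: [Z] = +1.03 and a completely filled cavity.
log_k_alkane = log_k_site(1.03, CAVITY_AREA)
k_alkane = 10.0 ** log_k_alkane

# Eq 9: one primary site plus five identical alkane sites.
k_total = 10.0 ** log_k_primary + 5 * k_alkane
log_k_total = math.log10(k_total)

print(f"alkane site constant: {k_alkane:.0f} M^-1")
print(f"log K11 with alkane sites: {log_k_total:.2f}")
```

Because the five alkane sites together contribute under a hundred M⁻¹ against a primary-site constant of a few thousand, the sum shifts log K₁₁ by only about 0.01 unit.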

Table 2. Binding Site [Z] Values for Level II Calculations

| Binding Site or Substrate Type | No. of Examples | −ΔAᵃ | [Z] |
|---|---|---|---|
| Ar−COOH (m, p) | 20 | 125 | −0.5 ± 0.4 |
| Ar−COOH (o) | 8 | | +0.7 ± 0.5 |
| Ar−F (m, p) | 5 | | +0.9 ± 0.3 |
| Ar−Cl (m, p) | 13 | | +0.1 ± 0.3 |
| Ar−Br (m, p) | 7 | | −0.6 ± 0.3 |
| Ar−I (m, p) | 6 | | −1.0 ± 0.3 |
| Ar−OCH3 (m, p) | 8 | | +0.7 ± 0.8 |
| Ar−CH3 (m, p) | 18 | | +0.7 ± 0.3 |
| Ar−OH (m, p) | 4 | 97 | +0.3 ± 0.3 |
| Ar−NH2 (m, p) | 4 | 87 | +0.4 ± 0.3 |
| Ar−H | 8 | | +0.2 ± 0.3 |
| Ar−CN (m, p) | 9 | | +0.3 ± 0.3 |
| Ar−NO2 (m, p) | 15 | | +0.1 ± 0.3 |
| Ar−COOCH3 (C2H5) | 7 | | −0.1 ± 0.5 |
| Ar−CH=CH−COOH | 17 | 125 | −1.3 ± 0.5 |
| Ar−CH=CH− | 15 | | +0.3 ± 0.4 |
| CH3(CH2)nCH3 | 5 | | +1.0 ± 0.2 |
| CH3(CH2)nX [X = COO⁻, SO3⁻, SO4⁻, NH3⁺, OH, NMe3⁺, NHCONH2, COOR] | 33 | | −0.6 ± 0.4 |
| CH3(CH2)nCOOH | 11 | | −1.1 ± 0.4 |
| CF3(CF2)nCOOH | 4 | 69 | −1.0 ± 0.1 |
| Small, highly polar R−COOH | 5 | 0 | −2.4 ± 0.4 |
| Branched alcohols | 9 | | +0.8 ± 0.4 |
| Inorganic anions | 9 | 0 | −2.5 ± 0.7 |
| Phenylazobenzenes | 63 | | −1.2 ± 0.5 |
| Naphthylazobenzenes | 10 | | −0.2 ± 0.2 |
| CH$_n$Cl$_{4-n}$ | 3 | | +0.5 ± 0.4 |
| Cycloalkanes | 4 | | +0.7 ± 0.4 |
| Acetamides | 3 | | −0.8 ± 0.5 |
| Acetates | 5 | | −1.2 ± 0.3 |
| C6H5CH2− | 8 | | +1.1 ± 0.4 |
| Sugars | 5 | 42 | −2.0 ± 0.2 |
| Cyclohexenones, cyclohexadienones (including steroid A ring) | 17 | 97 | −0.9 ± 0.3 |

ᵃ Units are Å² molecule⁻¹.
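Examples 1-5 can be reproduced with a few lines of code. The Python sketch below uses two group areas from Table 1 and three [Z] entries from Table 2; the dictionary keys and function names are illustrative choices, and only the entries needed for these examples are included.

```python
import math

CAVITY_AREA = 125.0  # internal area of the α-CyD cavity, Å^2 per molecule

# Subset of Table 1 (group areas in Å^2 per molecule).
GROUP_AREA = {"CH2": 21.0, "CH3": 43.0}

# Subset of Table 2 (level II [Z] values).
LEVEL_II_Z = {"Ar-I": -1.0, "Ar-Cl": 0.1, "CH3(CH2)nX": -0.6}


def minus_delta_a(nonpolar_groups):
    """Eq 7: sum the nonpolar group areas, capped at the cavity area."""
    total = sum(GROUP_AREA[group] for group in nonpolar_groups)
    return min(total, CAVITY_AREA)


def level_one(minus_da):
    """Eq 12: level I estimate of log K11."""
    return 0.0146 * minus_da + 0.68


def level_two(site, minus_da, n_identical_sites=1):
    """Eq 6 with a Table 2 [Z] value, plus the eq 11 statistical term."""
    log_k_site = -1.74 - LEVEL_II_Z[site] + 0.032 * minus_da
    return math.log10(n_identical_sites) + log_k_site


# Example 4: 1-propanol, nonpolar part CH3 + 2 CH2 (OH is polar).
propanol_area = minus_delta_a(["CH3", "CH2", "CH2"])

print(f"Example 1 (level I):  {level_one(CAVITY_AREA):.3f}")
print(f"Example 2 (level I):  {level_one(0.0):.2f}")
print(f"Example 3 (level II): {level_two('Ar-I', CAVITY_AREA):.2f}")
print(f"Example 4 (level II): {level_two('CH3(CH2)nX', propanol_area):.2f}")
print(f"Example 5 (level II): {level_two('Ar-Cl', CAVITY_AREA, 2):.2f}")
```

For Example 1 the unrounded level I result is 2.505, which the text quotes as 2.51.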

Independent variables in the Table 3 correlations:

σ: Hammett substituent constant; use σm or σp as appropriate. For the benzodiazepine moiety use σ = +1.7.

σ*: Taft polar substituent constant. The sum Σσ* refers to groups attached to the carbinol carbon, including hydrogen.

$f_R$: Nys-Rekker fragmental partition constant.¹⁶ Σ$f_R$ is for the alkyl group R in RC6H5.

$R_D$: Ionic refraction¹⁸ at the sodium D line.

pKa: pKa value of conjugate acids of inorganic anions.²,¹⁹

n: Number of CH2 groups.

$n_C$: Number of C atoms.

Table 3. Binding Site [Z] Values for Level III Calculations

| Binding Site or Substrate Type | [Z] | rᵃ |
|---|---|---|
| Ar−COOH (m, p) | +0.39σ − 0.62 | 0.65 |
| Ar−F (m, p) | +0.9σ + 1.18 | |
| Ar−Cl (m, p) | +0.52σ + 0.08 | 0.72 |
| Ar−Br (m, p) | +0.54σ − 0.38 | 0.83 |
| Ar−I (m, p) | +0.40σ − 0.95 | 0.78 |
| Ar−OCH3 (m, p) | +2.1σ + 1.36 | |
| Ar−CH3 (m, p) | +0.46σ + 0.90 | 0.50 |
| Ar−NH2; Ar−OH (m, p) | +1.01σ + 0.34 | 0.72 |
| Ar−H | +0.69σ + 0.33 | 0.78 |
| Ar−CN (m, p) | +1.17σ + 0.35 | 0.96 |
| Ar−NO2 (m, p) | +1.19σ + 0.18 | 0.95 |
| Ar−COOCH3 (m, p) | +0.99σ + 0.01 | 0.68 |
| Ar−C(CH3)3 | +0.65σ + 0.68 | 0.95 |
| Ar−X (o) | +0.43 $[Z]_{p}$ + 0.43 | 0.71 |
| Ar−CH=CH−COOH (m, p) | +0.70σ − 1.28 | 0.62 |
| Ar−CH=CH−COOH (o) | +0.68σ − 0.37 | 0.68 |
| Ar−CH=CH− (p) | −0.29σ − 0.05 | 0.45 |
| Ar−CH=CH− (o, m) | +0.03 (±0.25) | |
| CH3(CH2)nCH3 | +1.03 (±0.20) | |
| CH3(CH2)nX [X = COO⁻, SO3⁻, SO4⁻, NH3⁺, NHCONH2] | −0.10n + 0.03 | 0.93 |
| CH3(CH2)nOH (n ≤ 3) | −0.34 (±0.07) | |
| CH3(CH2)nOH (n ≥ 4) | −0.47n + 1.75 | 1.00 |
| Alkylbenzenes | −0.85Σ$f_R$ + 1.35 | 0.77 |
| CH3(CH2)nCOOH | −1.07 (±0.24) | |
| CF3(CF2)nCOOH | −0.95 (±0.05) | |
| Small, highly polar R−COOH | −2.37 (±0.35) | |
| Branched alcohols | −0.72Σσ* + 1.14 | 0.64 |
| Halide ions | −0.18$R_D$ − 0.18 pKa − 1.39 | 0.99 |
| Other inorganic anions | −0.18$R_D$ + 0.13 pKa + 0.27 | 0.94 |
| Phenylazobenzenes | −1.26 (±0.49) | |
| Naphthylazobenzenes | −0.19 (±0.15) | |
| CH$_n$Cl$_{4-n}$ | +0.70 (±0.10) | |
| Cycloalkanes | −0.24$n_C$ + 2.47 | 0.90 |
| Acetamides | −1.05 (±0.21) | |
| Acetates | −1.23 (±0.09) | |
| C6H5CH2− | +0.92 (±0.08) | |
| Sugars | −1.96 (±0.23) | |
| Cyclohexenones, cyclohexadienones (including steroid A ring) | −0.95 (±0.28) | |
| HO(CH2)nOH (n ≤ 5) | +0.42 (±0.08) | |
| HO(CH2)nOH (n ≥ 6) | −0.52n + 3.78 | 0.99 |
| HO(CH2)$_{n+2}$OH | +1.17 $[Z]_{\mathrm{CH_3(CH_2)_nOH}}$ + 0.83 | 0.99 |
| Barbituric acids | −0.89 (±0.19) | |
| Barbituric acid anions | −0.56 (±0.11) | |
| Barbituric acid anions | +0.77 $[Z]_{\mathrm{acids}}$ + 0.13 | 0.91 |
| Cycloalkanols | +0.03 (±0.43) | |
| Naphthyl (and other fused aromatic rings) | +0.73 (±0.32) | |

ᵃ Correlation coefficient.
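As a check on how the Table 3 correlations feed into eq 6, the following Python sketch recomputes Examples 6-8. Only three rows of Table 3 are encoded, and the function names are illustrative, not from the paper.

```python
CAVITY_AREA = 125.0  # maximum -dA for α-CyD, Å^2 per molecule


def log_k11(z, minus_delta_a):
    """Eq 6."""
    return -1.74 - z + 0.032 * minus_delta_a


def z_aryl_nitro(sigma):
    """Table 3, Ar-NO2 (m, p): [Z] = +1.19 sigma + 0.18."""
    return 1.19 * sigma + 0.18


def z_branched_alcohol(sum_sigma_star):
    """Table 3, branched alcohols: [Z] = -0.72 (sum of sigma*) + 1.14."""
    return -0.72 * sum_sigma_star + 1.14


def z_halide(ionic_refraction, pka):
    """Table 3, halide ions: [Z] = -0.18 R_D - 0.18 pKa - 1.39."""
    return -0.18 * ionic_refraction - 0.18 * pka - 1.39


# Example 6: 3-nitrophenol (sigma_m = +0.13) and its anion (sigma_m = -0.47).
phenol = log_k11(z_aryl_nitro(0.13), CAVITY_AREA)
phenolate = log_k11(z_aryl_nitro(-0.47), CAVITY_AREA)

# Example 7: 3-methyl-2-butanol, sum of sigma* = 0.00 - 0.19 + 0.49.
butanol = log_k11(z_branched_alcohol(0.00 - 0.19 + 0.49), CAVITY_AREA)

# Example 8: iodide ion, R_D = 17.53, pKa = -9, no nonpolar area.
iodide = log_k11(z_halide(17.53, -9.0), 0.0)

# The text quotes 1.93, 2.64, 1.34, and 1.19 for these four systems.
for name, value in [("3-nitrophenol", phenol), ("3-nitrophenolate", phenolate),
                    ("3-methyl-2-butanol", butanol), ("iodide", iodide)]:
    print(f"{name}: {value:.2f}")
```

The same pattern extends to the other linear rows of Table 3: evaluate the correlation for [Z], then substitute it with −ΔA into eq 6.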


Results and Discussion

Level I Predictions

The level I calculation, embodied in eq 12, was applied to 624 systems, with the result that 165 (26.4%) of the log K11 estimates were within 0.30 unit of the experimental values and 479 (76.8%) were within 1.00 unit. This is quite a satisfactory outcome for such a simple procedure.

The most frequently selected value of −ΔA is 125 Å² molecule⁻¹, which with eq 12 generates log K11 (calcd) = 2.51. Table 2 gives some −ΔA values for guidance in the estimation of this quantity. The following comments concern whether a particular functional group is polar or nonpolar according to the criterion adopted earlier.

It was pointed out above that aliphatic COOH is polar whereas aromatic COOH is nonpolar. Where then does the COOH group in a cinnamic acid lie? This can be assessed by comparing the experimental value (ref 3) of log P for trans-cinnamic acid (2.13) with the values calculated from fragmental constants (ref 16), assuming first that COOH is aliphatic (1.38) and then that COOH is aromatic (2.38). The experimental value lies much closer to the aromatic estimate (a difference of 0.25) than to the aliphatic one (0.75). This comparison reveals that the cinnamic acid COOH group has the polarity of an aromatic carboxylic acid, and therefore that it is to be counted in the quantity −ΔA, which becomes 125 for cinnamic acids (which are thought to have the −CH=CHCOOH moiety inserted in the cavity) (refs 20, 21).

The polarity of the −N=N− group as it appears in phenylazobenzenes and naphthylazobenzenes can be evaluated in this way: The experimental log P for 4-(dimethylamino)azobenzene is 4.58, whereas the sum of the fragmental constants, omitting only the azo group, is 3.96. The difference, +0.62, is attributed to the azo group, which therefore is nonpolar by our criterion. These azobenzene substrates are "threaded" through the α-CyD cavity (ref 22) and completely fill it, hence −ΔA = 125. This argument is supported by our experimental solvent effect studies on Methyl Orange (refs 9, 23), which yielded the parameter −gΔA = 56, corresponding to −ΔA = 133; the cavity is filled with nonpolar guest substance.

Level II Predictions

Level II predictions make use of eq
6 and the [Z] estimates of Table 2. This procedure was applied to 612 systems, leading to 281 results (45.9%) within 0.30 unit of the experimental values, and 558 (91.2%) within 1.00 unit. This constitutes a definite improvement over the level I results.

It must be realized that this estimation process involves the exercise of considerable chemical judgment and is therefore subject to variation in the hands of different workers. The key step is the assignment of −ΔA, which is not always unambiguous. The next step is to estimate [Z], but of course the [Z] quantities listed in Table 2 were a consequence of the selected −ΔA values. The [Z] and −ΔA values are therefore coupled. This is why the −ΔA assignments are cited for some possibly ambiguous cases in Table 2.

Level III Predictions

At level III, calculations are made
with eq 6 and the [Z] estimates provided by Table 3. This procedure was applied to 569 systems. Of these, 332 (58.3%) achieved agreement within 0.30 unit of the experimental log K11 values, and 542 (95.3%) within 1.00 unit. This is the performance alluded to in the last sentence of the introduction as not quite meeting our goal, but very respectably approaching it.

Figure 1 is a graphical presentation of the level III results. (In the highly congested areas of this plot some points have had to be omitted or slightly displaced.) Some of the serious outliers in Figure 1 are probably a consequence of large experimental errors, but others may be the result of inadequacies in the predicted quantity, either because the wrong value of −ΔA was employed or the wrong model for [Z] was chosen, or because the method as developed is inadequate for the particular application. (In the following section another source of deviations will be identified.)

The performance at level III in estimating log K11 for some more homogeneous subsets of the population can be seen in Figures 2 (substituted benzoic acids and benzoic acid anions, including ortho, meta, and para substitutions), 3 (substituted phenols and phenolates), and 4 (inorganic anions).

The empirical correlations in Table 3 generate estimates
of [Z], but in addition they may provide insight, or at least questions, concerning the details of complex formation, including the forces involved. For example, the signs of the ρ values (slopes of the Hammett plots against σ) for aromatic binding sites are consistent with the aforementioned postulate that complex stability is enhanced by increased electron density at the binding site. (A more negative [Z] is complex stabilizing; a more positive [Z] is destabilizing.) Such electronic effects are consistent with a role for induction and dispersion forces in the substrate-ligand interaction. The intercepts in these correlations may reflect in part the polarizabilities of the binding sites. The role of polarizability is evident in the dependence on ionic refraction in the inorganic anions.

Figure 1. Plot of log K11 (obsd) against log K11 (calcd) at level III. The slope of the line is unity.

Figure 2. Plot of log K11 (obsd) against log K11 (calcd) at level III for substituted benzoic acids (open circles) and benzoic acid anions (filled circles). The line has unit slope.

Some correlations raise questions that they do not answer,
as in the series CH3(CH2)nX, where [Z] depends upon n. It is clear that the presence of the polar group X is complex stabilizing relative to the n-alkanes, and as demonstrated by example 9, the mode of complexation is construed to involve a maximal −ΔA (hydrophobic) contribution combined with an extracavity interaction of X with the CyD. But why then should [Z] depend upon n; that is, what role is being played by the portion of the alkyl chain that is not in the cavity? This may conceivably be an electronic effect. (As example 9 shows, the purely alkane-type isomeric binding makes a negligible contribution.)

Another type of question is suggested by the inorganic anions, which have K11 values in the range 1 to 30 M⁻¹. It might therefore seem that the essential question to ask is: Why are these complexes so weak? These are very polar guests, for which it seems reasonable to set −ΔA = 0; moreover they presumably are solvated in the unbound state, a complex-destabilizing effect. Upon calculating [Z] with eq 6 using −ΔA = 0 and K11 = 1 to 30, we find that [Z] ranges from −1.7 to −3.2, highly stabilizing values. Evidently the appropriate question is: Why are these complexes so strong? The answer must be that substrate-ligand interaction (the solute-solute or intrasolute effect) is very favorable in these systems.

Further development of the level III procedure probably could lead to modest improvements in performance by refining the correlations in Table 3, or by adding new ones as more data become available. (The problem of substituent effects in the phenylazobenzenes, in particular, is a morass.) The correlation coefficients in Table 3 reveal the quality of these correlations, which are often crude. Nevertheless, despite such instances of unexplained variability, the correlations appear to capture much of the influence that guest structure has on the quantity [Z].

Although the level III procedure is more accurate, on average, than the lower level methods, it does not fully supersede them because it is not as generally applicable, as is revealed by the numbers of systems to which the three calculational levels were applied (624, 612, and 569 at levels I, II, and III, respectively). The more detailed the level of calculation, the more information needed to apply it, and this information may not be available.

The Next Levels

The essence of the preceding methods
has been to separate log K11 into its hydrophobic (−ΔA term) contribution and the substrate-specific ([Z] term) contribution, and then to apply empirical linear free energy relationships or group contributions to the [Z] component. This is preferable to applying such empirical correlations to log K11 directly, because log K11 is a composite quantity.

But according to its definition, [Z] is itself a composite quantity given by eq 13.

For convenience eq 13 is rewritten with simpler symbolism as eq 14.

This multicomponent nature of [Z] may be the cause of some of the deviations seen in the quantities and correlations of Tables 2 and 3.

The next more sophisticated level of calculation probably must take into account eq 14 and the individual contributors to [Z]. In this equation $z_{\mathrm{intra}}^{C}$ measures the substrate-ligand interaction, and the other quantities are solvation contributions. (C represents the complex, S the substrate, and L the ligand, which is α-CyD.) Equation 14 can be usefully transformed in the following way. We write each of the $z_{w}$ components as a sum of two quantities, one referring to the α-CyD cavity or to the substrate binding site (bs) included in the cavity, and the other to the exterior of the α-CyD or to the nonincluded portion of the substrate.

With the assumption $z_{w}^{L(\mathrm{ext})} = z_{w}^{C(\mathrm{ext})}$, combination of eqs 14-17 gives eq 18.
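The combination can be written out term by term. The following is a sketch that uses only eqs 14-17 and the stated assumption that the exterior solvation terms of the ligand and the complex are equal; the intermediate lines are supplied here and do not appear in the original.

```latex
\begin{align*}
[Z] &= z_{\mathrm{intra}}^{C} + z_{w}^{C} - z_{w}^{S} - z_{w}^{L} \\
    &= z_{\mathrm{intra}}^{C}
       + \bigl( z_{w}^{C(\mathrm{ext})} + z_{w}^{S(\mathrm{nonincl})} \bigr)
       - \bigl( z_{w}^{S(\mathrm{nonincl})} + z_{w}^{S(\mathrm{bs})} \bigr)
       - \bigl( z_{w}^{L(\mathrm{ext})} + z_{w}^{L(\mathrm{cav})} \bigr) \\
    &= z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})} - z_{w}^{L(\mathrm{cav})}
       + \bigl( z_{w}^{C(\mathrm{ext})} - z_{w}^{L(\mathrm{ext})} \bigr) \\
    &= z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})} - z_{w}^{L(\mathrm{cav})}
\end{align*}
```

The nonincluded-substrate terms cancel exactly because the same quantity appears in eqs 16 and 17; the exterior terms cancel only by assumption.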

Figure 3. Plot of log K11 (obsd) against log K11 (calcd) at level III for substituted phenols (open circles) and phenolates (filled circles). The line has unit slope.

Figure 4. Plot of log K11 (obsd) against log K11 (calcd) at level III for inorganic anions. The line has unit slope.
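For the inorganic anions of Figure 4, the text sets −ΔA = 0 and quotes K11 values of 1 to 30 M⁻¹. The short Python sketch below back-calculates [Z] from those inputs, assuming that the working equation is the abstract's relation log K11 = −1.74 − [Z] + 0.032(−ΔA). The function name is hypothetical.

```python
import math


def z_from_k11(k11, minus_delta_a):
    """Invert log K11 = -1.74 - [Z] + 0.032(-dA) to obtain [Z].

    k11 is the 1:1 binding constant in M^-1; minus_delta_a is -dA in
    square angstroms per molecule.
    """
    return -1.74 - math.log10(k11) + 0.032 * minus_delta_a


# Limits of the K11 range quoted for inorganic anions, with -dA = 0.
for k11 in (1.0, 30.0):
    print(k11, round(z_from_k11(k11, 0.0), 2))
# prints:
# 1.0 -1.74
# 30.0 -3.22
```

These values agree with the range of about −1.7 to −3.2 given in the text.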

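$$[Z] = \frac{\Delta G_{\mathrm{intrasol}}^{C}}{2.3kT} + \frac{\Delta G_{w}^{C}}{2.3kT} - \frac{\Delta G_{w}^{S}}{2.3kT} - \frac{\Delta G_{w}^{L}}{2.3kT} \tag{13}$$

$$[Z] = z_{\mathrm{intra}}^{C} + z_{w}^{C} - z_{w}^{S} - z_{w}^{L} \tag{14}$$

$$z_{w}^{L} = z_{w}^{L(\mathrm{ext})} + z_{w}^{L(\mathrm{cav})} \tag{15}$$

$$z_{w}^{S} = z_{w}^{S(\mathrm{nonincl})} + z_{w}^{S(\mathrm{bs})} \tag{16}$$

$$z_{w}^{C} = z_{w}^{C(\mathrm{ext})} + z_{w}^{S(\mathrm{nonincl})} \tag{17}$$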
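Reading eqs 13 and 14 term by term, each z is a free energy change divided by 2.3kT, so one z unit corresponds to one unit of log K11. The following Python sketch converts between the two scales on a molar basis. Two assumptions are made here that are not in the original: the molar form 2.303RT is used in place of the per-molecule 2.3kT, and the factor 2.3 is taken as ln 10. The function names are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def z_from_delta_g(delta_g_kj_mol, temp_k=298.15):
    """Dimensionless z from a molar free energy change in kJ/mol."""
    return delta_g_kj_mol * 1000.0 / (math.log(10.0) * R * temp_k)


def delta_g_from_z(z, temp_k=298.15):
    """Molar free energy change in kJ/mol equivalent to a given z."""
    return z * math.log(10.0) * R * temp_k / 1000.0


# Energy equivalent of one z unit (one log K11 unit) at 25 degrees C.
print(round(delta_g_from_z(1.0), 2))  # prints 5.71
```

On this scale a z contribution of magnitude 1 amounts to roughly 5.7 kJ/mol at 25 °C.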


This transformation has changed [Z] from its dependence on four quantities (eq 14) to a function of three quantities (eq 18). It is important to note, however, that whereas eq 14 is exact (within the context of this theory), eq 18 is approximate. For example, eq 14 admits the possibility that solvation might be net complex stabilizing, perhaps through the creation of solvation modes in C that do not exist in S or L; eq 18, on the other hand, says that solvation can only be destabilizing. (We here are supposing that the individual z quantities, which are themselves free energy changes for spontaneous processes, can only be negative.) Nevertheless, eq 18 expresses the essential energetic influences on [Z] in most systems, namely, that the complex is stabilized by substrate-ligand interaction and is destabilized by solvation of the substrate and of the ligand. The important solvation interactions are at the substrate binding site and the CyD cavity, both of which must undergo desolvation in order for the complex to form.

Consider the application of eq 18 to a saturated hydrocarbon, in particular to the n-alkanes. As potential binding sites these should be but weakly hydrated; moreover their interaction with α-CyD within the complex should also be weak. It therefore seems reasonable to set $z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})} \approx 0$ for these substances, giving [Z] = $-z_{w}^{L(\mathrm{cav})}$. From Table 3, [Z] = +1.03 (±0.20) for n-alkanes, giving us $z_{w}^{L(\mathrm{cav})} = -1.03$ as a provisional estimate. This applies to all α-CyD complexes, and it means that every α-CyD complex is destabilized to the extent of 1.0 log K11 unit by solvation of the α-CyD (in water).

The other quantities in eq 18 are substrate specific. It may be useful to observe that values of [Z] calculated from experimental log K11 values range from about −3.5 to +1.8. Combining these limits with eq 18 and the value of $z_{w}^{L(\mathrm{cav})}$ shows that the limits of the difference $z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})}$ are about −4.5 and +0.8. We conclude that the individual z quantities will, by and large, probably be comparable in magnitude to the [Z] composed of them. Indeed, this may be why so many reasonable linear free energy relationships of [Z] can be written; if the individual z were orders of magnitude larger than [Z], much less regularity of behavior might have been seen. The argument from limits may perhaps be sharpened. Thus we had earlier (ref 5) concluded, on the basis of structural studies, that the maximum value for the logarithm of a binding site constant at an aromatic binding site is 3.72. Since −ΔA = 125 for most aromatic sites, this estimate yields [Z] = −1.46, or $z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})} = -2.49$ as a limit for such a system. If this limit implies that $z_{w}^{S(\mathrm{bs})} = 0$, then −2.49 is the maximum stabilizing value of $z_{\mathrm{intra}}^{C}$ at an aromatic binding site. (This postulated limit is closely approached by 4-iodophenolate.) Clearly this next step of decomposing [Z] into its constituent parts will be difficult, but it should be rewarding to have available these separate energetic components of the complexation process.
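The limit arguments above are simple arithmetic on eq 18 and can be checked with a short Python sketch. It assumes the abstract's relation log K11 = −1.74 − [Z] + 0.032(−ΔA) as the working equation and takes the provisional cavity solvation term of −1.03 from the n-alkane entry. All names are hypothetical.

```python
# Provisional cavity-solvation term, from [Z] = +1.03 for n-alkanes.
Z_W_L_CAV = -1.03


def z_from_log_k11(log_k11, minus_delta_a):
    """[Z] from log K11 via log K11 = -1.74 - [Z] + 0.032(-dA)."""
    return -1.74 - log_k11 + 0.032 * minus_delta_a


def intra_minus_site_solvation(z):
    """Eq 18 rearranged: z_intra(C) - z_w(S, bs) = [Z] + z_w(L, cav)."""
    return z + Z_W_L_CAV


# Limits of the difference for the observed range of [Z], -3.5 to +1.8.
limits = [intra_minus_site_solvation(z) for z in (-3.5, 1.8)]
print([round(v, 2) for v in limits])  # prints [-4.53, 0.77]

# Aromatic site: maximum log K of 3.72 with -dA = 125.
z_arom = z_from_log_k11(3.72, 125.0)
print(round(z_arom, 2), round(intra_minus_site_solvation(z_arom), 2))
# prints -1.46 -2.49
```

The results reproduce the rounded limits of about −4.5 and +0.8 and the aromatic-site values of −1.46 and −2.49 quoted in the text.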

References and Notes

1. Connors, K. A. J. Pharm. Sci. 1995, 84, 843.
2. Perrin, D. D.; Dempsey, B.; Serjeant, E. P. pKa Prediction for Organic Acids and Bases; Chapman and Hall: London, 1981.
3. Leo, A.; Hansch, C.; Elkins, D. Chem. Rev. 1971, 71, 525.
4. Dunn, W. J.; Grigoras, S.; Johansson, E. In Partition Coefficient: Determination and Estimation; Dunn, W. J., Block, J. H., Pearlman, R. S., Eds.; Pergamon: New York, 1986; p 21.
5. Connors, K. A.; Pendergast, D. D. J. Am. Chem. Soc. 1984, 106, 7607.
6. Khossravi, D.; Connors, K. A. J. Pharm. Sci. 1992, 81, 371.
7. LePree, J. M.; Mulski, M. J.; Connors, K. A. J. Chem. Soc., Perkin Trans. 2 1994, 1491.
8. Connors, K. A.; Khossravi, D. J. Solution Chem. 1993, 22, 677.
9. Mulski, M. M.; Connors, K. A. Supramol. Chem. 1995, 4, 271.
10. Khossravi, D.; Connors, K. A. J. Pharm. Sci. 1993, 82, 817.
11. Connors, K. A. Binding Constants: the Measurement of Molecular Complex Stability; Wiley: New York, 1987; p 86.
12. Bender, M. L.; Komiyama, M. Cyclodextrin Chemistry; Springer-Verlag: Berlin, 1978.
13. Szejtli, J. Cyclodextrins and Their Inclusion Complexes; Akademiai Kiado: Budapest, 1982.
14. Connors, K. A.; Lin, S.-F.; Wong, A. B. J. Pharm. Sci. 1982, 71, 217.
15. Lin, S.-F.; Connors, K. A. J. Pharm. Sci. 1983, 72, 1333.
16. Nys, G. G.; Rekker, R. F. Eur. J. Med. Chem. 1974, 9, 361.
17. This argument contains two ambiguities. First, n-hexane possesses two identical ends, so it is possible that nA should be set at 2 to reflect this fact. Second, in n-octane, there are actually two identical sites consisting of CH3(CH2)5 and one slightly different (CH2)6 site, but, with the available data, to make this distinction is not possible, so these are combined and treated as identical.
18. Le Fevre, R. J. W. Adv. Phys. Org. Chem. 1965, 3, 1.
19. Bell, R. P. The Proton in Chemistry; Cornell University Press: Ithaca, 1959.
20. Connors, K. A.; Rosanske, T. W. J. Pharm. Sci. 1980, 69, 173.
21. Rosanske, T. W.; Connors, K. A. J. Pharm. Sci. 1980, 69, 564.
22. Cramer, F.; Saenger, W.; Spatz, H.-Ch. J. Am. Chem. Soc. 1967, 89, 14.
23. Connors, K. A.; Mulski, M. J.; Paulson, A. J. Org. Chem. 1992, 57, 1794.

JS960167J

$$[Z] = z_{\mathrm{intra}}^{C} - z_{w}^{S(\mathrm{bs})} - z_{w}^{L(\mathrm{cav})} \tag{18}$$
