o )057, . 4 xml - 東京外国語大学 tokyo university of … sketch 1=]j2z da f£ 9)a8c kw4p hgon...
TRANSCRIPT
O��)057,��.��4� xml&!� ��
xml � ?<�GJ����&!
w3schools.com (http://www.w3schools.com/xml/xml_tree.asp)
(i)1)'7ML=&!N
<xml>
<doc Level="…" Title="…">
<text Types="…">
(ii) .)&7MD8&!N
</text>
</doc>
</xml>
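The tree structure above (an `xml` root, a `doc` element with `Level` and `Title` attributes, and a `text` element with a `Types` attribute) can be read back with a standard XML parser. The following is a minimal sketch using Python's `xml.etree.ElementTree`; the attribute values and body text in `SAMPLE` are hypothetical placeholders, since the original values are not legible in this transcript, and the function name `describe` is only illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names follow the example above,
# but every value and the body text are placeholders.
SAMPLE = """<xml>
<doc Level="beginner" Title="Sample document">
<text Types="essay">
This is the body text of the document.
</text>
</doc>
</xml>"""


def describe(xml_string):
    """Return one record per <text> element, with its parent <doc> metadata."""
    root = ET.fromstring(xml_string)
    records = []
    for doc in root.iter("doc"):
        for text in doc.iter("text"):
            records.append({
                "level": doc.get("Level"),
                "title": doc.get("Title"),
                "types": text.get("Types"),
                "body": (text.text or "").strip(),
            })
    return records


records = describe(SAMPLE)
for record in records:
    print(record)
```

Attributes on `doc` and `text` are the kind of metadata that a corpus tool can later expose as text types, which is why they are collected alongside the body text here.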
P�%")(�6$6���)057,
(i) CE�#7-%�9@�
���� 3)
(ii) #7-%�;:�A�FI�H>
Q�.��4��)057,
(i)
�2�/23��.��4�KB�Next� 3) ��)057,
3)
(ii)
(iii) ���.��4��)057,D���#7-%� compile �
3)
3)
Simple query: lemma
Lemma
Phrase
Word
Character

Context
    Lemma Filter: lemma
    PoS Filter
    all / any / none
Text types

Sort
    Left: L1
    Right: R1
    Node
    References
    L1, node, R1
Sample
Filter
    positive / negative
Frequency
    Frequency
    Node tags
    Node forms
    Doc IDs
    Text Types
    P/N (Positive/Negative)
Collocation
    Attribute: word/tag/lempos/lemma…
    Range
    Minimum Frequency in corpus
    Minimum Frequency in given range
    P/N (Positive/Negative)
Visualize
Sketch Engine
E
Concordance - CQL -
CQL (= Corpus Query Language)
CQL was developed in the 1990s at the IMS, University of Stuttgart.
� w s|{o h ” 3 �
> x D WW N WJ1a P JaE
b WW N WJ w S I*PJQQ *W L*PJQTSV yg
b P J w o g } yg
W R p x
D S I1aW RaE
!DW E RO!
W R p x
DPJQQ 1aW RaE
W R x p x
DPJQQ 1aW Ra" W L1a ) aE { y DPJQTSV1aW R(RaE
b x w※osy > xpm x LVJW V QQ
b� ) x
b� * & * { y *
W L1aa r x v
!HSR VJ) ! DW L1! ! W L1!==!E
!HSR VJ) ! DW L1! !E DW L1!==!E
!HSR VJ) ! DW L1! ==!E
W R & & x
DPJQQ 1aW Ra"W L1aB) aEDW L1a ) aE], - DW L1a= =aE
bD EuD Ex wy frs vls SO
b] t ], - y i ugh
nx t > os| u
t
� v �
JPT WS ISa V) JPT ISax “ c p
DPJQQ 1! JPT!"W L1!B) !ED S I1!WS!E DW L1!B) !E
b x y B3 J cB5 IS cB J cBB PJ[NH P J V
& J & (JI t x
DW L1a ) aEDPJQQ 1a JaEDW L1aB) a" S I1a) JIaE
PSSO* NRL & T*IS R x
DPJQQ 1aPSSO NRLa"W L1aB) aEDW L 1aB) aE]+ / a TIS Ra
< x x x
DW L1! ) !E DW L1! ) !E ! RI S ! DW L1! ) !E
� NW NR r �
o
D S I1aHSR V) aEDW L 1aB) aE D S I1a aE NW NR0V*2
t { c t x wf p sx p
DW L1a ) aE& NW NRDW L1aB3) aEDE DW L1aB3) aE
wursnvg x
b�> J NW NR > J ugh tf
b� SRW NRNRL f H ) OJWH RLNRJ
-+,+ d ./ CJ
OJWH RLNRJ x u e
x�>G� >Gfo 2015
Sketch Engine\�FBI6K
{s:Uq��
3. Word List
;PC=§<D;PC=¨`�1�t�5za"�3
:J?:�3#�
�'�¢�a!�+�
R�1£&
Subcorpus: selects the subcorpus to search.
Search attribute: word, lemma, tag (POS), and so on.
use n-grams: lists sequences of n words; the value of n can be chosen.
��+""���!J=@5[w"�+� +�WS' options5\��#."�+�
For example, setting the corpus to BNC, the subcorpus to Written_Medium_Book, and the Search attribute to lemma, then selecting word list, gives the following result.
(Lemmas are listed in order of frequency.)
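The idea behind a word list and the use n-grams option can be sketched in a few lines of Python. This is only an illustration on a toy token list, not how Sketch Engine itself computes its lists; the function names `ngrams` and `frequency_list` and the sample sentence are hypothetical, and plain word forms are counted here because no lemmatizer is assumed.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def frequency_list(items):
    """Return (item, frequency) pairs, most frequent first."""
    return Counter(items).most_common()


# Toy corpus standing in for a real (sub)corpus.
tokens = "the cat sat on the mat and the cat slept".split()

word_list = frequency_list(tokens)               # single word forms
bigram_list = frequency_list(ngrams(tokens, 2))  # n-grams with n = 2

print(word_list[:3])
print(bigram_list[:3])
```

With n = 1 the n-gram list reduces to the ordinary word list, so the same counting step serves both cases.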
Filter options
Regular expression: regular expressions can be used. ".*" is a wildcard (it matches any string, including the empty string). For example, "th.*" selects words such as the, that, this.
§�'V�©�«�¦%$��2+�¨
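Minimum frequency: specifies the minimum frequency.
Maximum frequency: specifies the maximum frequency.
Whitelist: a file listing the only words to be included in the word list can be uploaded.
Blacklist: a file listing words to be excluded from the word list can be uploaded.
Include non-words: used to include items such as numbers and symbols.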
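The filter options above can be mimicked on a small frequency table. The sketch below is an illustration under stated assumptions, not Sketch Engine's implementation: the function name `filter_word_list` and the sample counts are hypothetical, the regular expression is assumed to match the whole word, and a maximum frequency of 0 is assumed to mean "no upper limit".

```python
import re
from collections import Counter


def filter_word_list(freqs, pattern=".*", min_freq=0, max_freq=0,
                     whitelist=None, blacklist=None):
    """Apply regex, frequency, whitelist and blacklist filters to a Counter."""
    regex = re.compile(pattern)
    kept = []
    for word, freq in freqs.most_common():
        if not regex.fullmatch(word):      # assumption: whole-word match
            continue
        if freq < min_freq:
            continue
        if max_freq and freq > max_freq:   # assumption: 0 means no upper limit
            continue
        if whitelist is not None and word not in whitelist:
            continue
        if blacklist is not None and word in blacklist:
            continue
        kept.append((word, freq))
    return kept


# Hypothetical counts, for illustration only.
freqs = Counter({"the": 50, "that": 20, "this": 12,
                 "cat": 7, "thing": 3, "singing": 2})

th_words = filter_word_list(freqs, pattern="th.*", min_freq=5)
ing_words = filter_word_list(freqs, pattern=".*ing")

print(th_words)
print(ing_words)
```

Passing `blacklist={"the"}` together with `pattern="th.*"` would drop "the" while keeping the other matches, which mirrors uploading a blacklist file.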
Output options
Frequency figures:
    Hit counts (= raw frequency)
    Document counts
    ARF (Average Reduced Frequency)
Output type:
    Simple
    Keywords
        Reference (sub)corpus
        Prefer: rare/common words
    Change output attribute(s):
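The three frequency figures differ in what they count. As a rough sketch: hit counts are raw occurrences, document counts are the number of documents containing the word, and ARF discounts occurrences that are bunched together. ARF is commonly defined as ARF = (1/v) Σ min(d_i, v), where v = N/f, N is the corpus size in tokens, f is the raw frequency, and d_i are the distances between consecutive occurrences (the first distance wraps around the end of the corpus). The function names, toy documents and positions below are hypothetical.

```python
def hit_count(docs, word):
    """Raw frequency: total occurrences across all documents."""
    return sum(doc.count(word) for doc in docs)


def document_count(docs, word):
    """Number of documents containing the word at least once."""
    return sum(1 for doc in docs if word in doc)


def average_reduced_frequency(positions, corpus_size):
    """ARF = (1/v) * sum(min(d_i, v)), with v = corpus_size / frequency."""
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f
    pos = sorted(positions)
    # The first distance is cyclic: from the last occurrence, around the
    # end of the corpus, to the first occurrence.
    distances = [pos[0] + corpus_size - pos[-1]]
    distances += [b - a for a, b in zip(pos, pos[1:])]
    return sum(min(d, v) for d in distances) / v


# Toy documents (token lists), for illustration only.
docs = [["a", "b", "a"], ["c"], ["a"]]
print(hit_count(docs, "a"), document_count(docs, "a"))

# Three occurrences in a 12-token corpus: evenly spread vs. clustered.
print(average_reduced_frequency([0, 4, 8], 12))
print(average_reduced_frequency([0, 1, 2], 12))
```

An evenly spread word keeps an ARF equal to its raw frequency, while a word whose occurrences cluster in one place gets a lower ARF.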
BNC: search attribute pos, minimum frequency 0
BNC, subcorpus Written_Domain_Imaginative: search attribute lemma, regular expression wh.*, minimum frequency 1, maximum frequency 0
BNC, subcorpus Written_Domain_Informative: search attribute word, regular expression .*ing, Frequency figures: Document counts
BNC: search attribute word, use n-grams with n=4
BNC, subcorpus Written_Medium_Book: search attribute word, Output type: Keywords, Reference subcorpus: BNC subcorpus Written_Medium_To-be-spoken
BNC: search attribute word, regular expression .*ing, Output type: Change output attributes, selecting lemma, pos, lempos
��'Q �#vl��u word&��2
SketchEngine 4. Word Sketch

Word Sketch
1. Select a corpus.
2. Enter the Lemma and select the Part of speech.
3. Click Show Word Sketch (Advanced options).
4. The results are displayed (Figure 1).

Figure 1: make

Options on the results screen (Figure 1): Change options, Cluster, Sort by freq / Sort by score, Hide gramrels, More Data (1 column), Less Data (1 column)

Word Sketch example: make
1. British National Corpus
2. Lemma "make", Part of speech "verb"
3. (Figure 1)
4. "np adj comp" (50.90) (Figure 2)
make "make+O+C"

Figure 2: make
* "Less Data"
Thesaurus/sketch-Diff
2 S
�� 1 c BNC ���
"love” 2POS
2
2 love 2
W 3 2
W W 2 W
3
2 desire
Word sketch e 2Thesaurus 2 k
Sketch-Diff W 3
2 love desire c b 3
! 2 and/or 2 2
love desire
WE
3
! love 2 desire
love satisfied
c hn c
i t 0 5
SketchEngine mx 7. WebBootCat
)6Q\ 2gr
VT�W� 5!;4?�]f 4`
YXI_
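What is WebBootCaT?
WebBootCaT collects texts from the Internet and builds a corpus from them.

Using WebBootCaT (see Figure 1)
1. From the menu on the Home page, click WebBootCaT.
2. Enter a name for the corpus.
3. For Input type, select either Seed words or URLs.
Figure 1: WebBootCaT screen

WebBootCaT input types (Seed words/URLs)
Building a Web corpus with "Seed words"
1. After steps 1 and 2 above, check Seed words for Input type.
2. Enter keywords (3 to 20) in the Seed words field.
3. The seed words are combined at random in sets of three and used to search the Internet; the resulting URLs are listed (see Figure 2).
4. Click Next to download the collected texts.
5. Click OK to build the corpus.
Figure 2: List of URLs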
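The seed-word step can be pictured as drawing random combinations of three keywords, each of which becomes one search query. The sketch below only illustrates that combination step and is not WebBootCaT's actual code: the function name `seed_tuples`, the number of queries, and the sample seed words are all hypothetical, and no search engine is contacted.

```python
import random
from math import comb


def seed_tuples(seed_words, tuple_size=3, n_queries=10, seed=None):
    """Draw up to n_queries distinct random combinations of seed words."""
    if len(seed_words) < tuple_size:
        raise ValueError("need at least tuple_size seed words")
    rng = random.Random(seed)
    # Never ask for more distinct tuples than actually exist.
    limit = min(n_queries, comb(len(seed_words), tuple_size))
    queries = set()
    while len(queries) < limit:
        queries.add(tuple(sorted(rng.sample(seed_words, tuple_size))))
    return sorted(queries)


# Hypothetical seed words on one topic.
seeds = ["corpus", "lemma", "concordance", "collocation", "frequency"]

queries = seed_tuples(seeds, tuple_size=3, n_queries=4, seed=0)
for query in queries:
    print(" ".join(query))
```

Each printed line stands for one query string; the pages returned for such queries would form the URL list shown in Figure 2.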
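Building a Web corpus with "URLs"
1. After steps 1 and 2 above, check URLs for Input type.
2. Enter the URLs in the URLs field (see Figure 3).
3. Click Next, then OK, to build the corpus (see Figure 4).
Figure 3: Entering URLs
Figure 4: The resulting corpus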