
Forging New Frontiers: Fuzzy Pioneers I


Studies in Fuzziness and Soft Computing, Volume 217

Editor-in-chief
Prof. Janusz Kacprzyk
Systems Research Institute
Polish Academy of Sciences
ul. Newelska 6
01-447 Warsaw
Poland
E-mail: [email protected]

Further volumes of this series can be found on our homepage: springer.com

Vol. 202. Patrick Doherty, Witold Łukaszewicz, Andrzej Skowron, Andrzej Szalas
Knowledge Representation Techniques: A Rough Set Approach, 2006
ISBN 978-3-540-33518-4

Vol. 203. Gloria Bordogna, Giuseppe Psaila (Eds.)
Flexible Databases Supporting Imprecision and Uncertainty, 2006
ISBN 978-3-540-33288-6

Vol. 204. Zongmin Ma (Ed.)
Soft Computing in Ontologies and Semantic Web, 2006
ISBN 978-3-540-33472-9

Vol. 205. Mika Sato-Ilic, Lakhmi C. Jain
Innovations in Fuzzy Clustering, 2006
ISBN 978-3-540-34356-1

Vol. 206. A. Sengupta (Ed.)
Chaos, Nonlinearity, Complexity, 2006
ISBN 978-3-540-31756-2

Vol. 207. Isabelle Guyon, Steve Gunn, Masoud Nikravesh, Lotfi A. Zadeh (Eds.)
Feature Extraction, 2006
ISBN 978-3-540-35487-1

Vol. 208. Oscar Castillo, Patricia Melin, Janusz Kacprzyk, Witold Pedrycz (Eds.)
Hybrid Intelligent Systems, 2007
ISBN 978-3-540-37419-0

Vol. 209. Alexander Mehler, Reinhard Köhler
Aspects of Automatic Text Analysis, 2007
ISBN 978-3-540-37520-3

Vol. 210. Mike Nachtegael, Dietrich Van der Weken, Etienne E. Kerre, Wilfried Philips (Eds.)
Soft Computing in Image Processing, 2007
ISBN 978-3-540-38232-4

Vol. 211. Alexander Gegov
Complexity Management in Fuzzy Systems, 2007
ISBN 978-3-540-38883-8

Vol. 212. Elisabeth Rakus-Andersson
Fuzzy and Rough Techniques in Medical Diagnosis and Medication, 2007
ISBN 978-3-540-49707-3

Vol. 213. Peter Lucas, José A. Gámez, Antonio Salmerón (Eds.)
Advances in Probabilistic Graphical Models, 2007
ISBN 978-3-540-68994-2

Vol. 214. Irina Georgescu
Fuzzy Choice Functions, 2007
ISBN 978-3-540-68997-3

Vol. 215. Paul P. Wang, Da Ruan, Etienne E. Kerre (Eds.)
Fuzzy Logic, 2007
ISBN 978-3-540-71257-2

Vol. 216. Rudolf Seising
The Fuzzification of Systems, 2007
ISBN 978-3-540-71794-2

Vol. 217. Masoud Nikravesh, Janusz Kacprzyk, Lotfi A. Zadeh (Eds.)
Forging New Frontiers: Fuzzy Pioneers I, 2007
ISBN 978-3-540-73181-8


Masoud Nikravesh · Janusz Kacprzyk · Lotfi A. Zadeh
Editors

Forging New Frontiers:Fuzzy Pioneers I

With 200 Figures and 52 Tables


Masoud Nikravesh
University of California, Berkeley
Department of Electrical Engineering and Computer Science (EECS)
94720 Berkeley, USA
[email protected]

Janusz Kacprzyk
Systems Research Institute
PAN Warszawa
Newelska 6
01-447 Warszawa
[email protected]

Lotfi A. Zadeh
University of California, Berkeley
Department of Electrical Engineering and Computer Science (EECS)
Soda Hall 729
[email protected]

Library of Congress Control Number: 2007930457

ISSN print edition: 1434-9922ISSN electronic edition: 1860-0808ISBN 978-3-540-73181-8 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2007

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Typesetting: Integra Software Services Pvt. Ltd., India
Cover design: WMX Design, Heidelberg

Printed on acid-free paper SPIN: 11865285 89/3180/Integra 5 4 3 2 1 0


To the Fuzzy Community


Preface

The 2005 BISC International Special Event, BISCSE'05 "FORGING THE FRONTIERS," was held at the University of California, Berkeley, "WHERE FUZZY LOGIC BEGAN," from November 3–6, 2005. The successful applications of fuzzy logic and its rapid growth suggest that the impact of fuzzy logic will be felt increasingly in the coming years. Fuzzy logic is likely to play an especially important role in science and engineering, but eventually its influence may extend much farther. In many ways, fuzzy logic represents a significant paradigm shift in the aims of computing, a shift which reflects the fact that the human mind, unlike present-day computers, possesses a remarkable ability to store and process information which is pervasively imprecise, uncertain and lacking in categoricity.

The BISC Program invited pioneers, the most prominent contributors, researchers, and executives from around the world who are interested in forging the frontiers through the use of fuzzy logic and soft computing methods to create the information backbone for global enterprises. This special event provided a unique and excellent opportunity for the academic, industrial and corporate communities to address new challenges, share solutions, and discuss research directions for a future with enormous potential for the global economy.

Over 300 researchers and executives attended this event. They presented their most recent work, their original contributions, and historical trends of the field. In addition, four high-level panels were organized to discuss issues related to the field and its trends, past, present and future.

Panel one, "SOFT COMPUTING: PAST, PRESENT, FUTURE (GLOBAL ISSUE)," was organized and moderated by Janusz Kacprzyk and Vesa A. Niskanen. The panelists were Janusz Kacprzyk, Vesa A. Niskanen, Didier Dubois, Masoud Nikravesh, Hannu Nurmi, Rudi Seising, Richard Tong, Enric Trillas, and Junzo Watada. The panel discussed general issues such as the role of soft computing in the social and behavioral sciences, medicine, economics and philosophy (especially philosophy of science, methodology, and ethics). The panel also considered aspects of decision making, democracy and voting, as well as manufacturing, industrial and business aspects.

The second panel, "FLINT AND SEMANTIC WEB," was organized and moderated by Elie Sanchez and Masoud Nikravesh. The panelists were C. Carlsson, N. Kasabov, T. Martin, M. Nikravesh, E. Sanchez, A. Sheth, R. R. Yager and L. A. Zadeh. The organizers believe that this is an exciting time in the fields of Fuzzy Logic and the Internet (FLINT) and the Semantic Web. The panel discussion added to




the excitement, as it focused on the growing connections between these two fields. Most of today's Web content is suitable only for human consumption. The Semantic Web is presented as an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. But while the vision of the Semantic Web and the associated research attract attention, as long as bivalent-based logical methods are used, little progress can be expected in handling the ill-structured, uncertain or imprecise information encountered in real-world knowledge. During recent years, important initiatives have led to reports of connections between Fuzzy Logic and the Internet (FLINT). FLINT meetings ("Fuzzy Logic and the Internet") have been organized by BISC (the "Berkeley Initiative in Soft Computing"). Meanwhile, scattered papers have been published on Fuzzy Logic and the Semantic Web, and special sessions and workshops have been organized, showing the positive role Fuzzy Logic could play in the development of the Internet and the Semantic Web, filling a gap and facing a new challenge. The Fuzzy Logic field has been maturing for forty years. These years have witnessed tremendous growth in the number and variety of applications, with real-world impact across a wide variety of domains involving humanlike behavior and reasoning. We believe that in the coming years, FLINT and the Semantic Web will be major fields of application of Fuzzy Logic. This panel session discussed concepts, models, techniques and examples exhibiting the usefulness, and the necessity, of using Fuzzy Logic in the Internet and the Semantic Web. In fact, the question is not really a matter of necessity, but of recognizing where, and how, it is necessary.

The third panel, "FUZZY SETS IN INFORMATION SYSTEMS," was organized and moderated by Patrick Bosc and Masoud Nikravesh. The panelists were P. Bosc, R. de Caluwe, D. Kraft, M. Nikravesh, F. Petry, G. de Tré, Raghu Krishnapuram, and Gabriella Pasi. Fuzzy sets approaches have been applied in the database and information retrieval areas for more than thirty years. A certain degree of maturity has been reached in the use of fuzzy sets techniques in relational databases, object-oriented databases, information retrieval systems, geographic information systems and systems dealing with the huge quantity of information available through the Web. The panel discussed research work undertaken in areas including database design, flexible querying, imprecise database management, digital libraries and Web retrieval, and emphasized its major outputs. It also identified aspects that seem promising for further research.

The fourth panel, "A GLIMPSE INTO THE FUTURE," was organized and moderated by L. A. Zadeh and Masoud Nikravesh. The panelists were Janusz Kacprzyk, K. Hirota, Masoud Nikravesh, Henri Prade, Enric Trillas, Burhan Turksen, and Lotfi A. Zadeh. Predicting the future of fuzzy logic is difficult if fuzzy logic is interpreted in its wide sense, that is, as a theory in which everything is or is allowed to be a matter of degree. But what is certain is that as we move further into the age of machine intelligence and mechanized decision-making, both theoretical and applied aspects of fuzzy logic will gain in visibility and importance. What will stand out is the unique capability of fuzzy logic to serve as a basis for reasoning and computation with information described in natural language. No other theory has this capability.



The chapters of this book evolved from presentations made by selected participants at the meeting and are organized in two books. The papers include reports from different fronts of soft computing in various industries and address problems from different fields of research in fuzzy logic, fuzzy sets and soft computing. The book provides a collection of forty-two (42) articles in two volumes.

We would like to take this opportunity to thank all the contributors and reviewers of the articles. We also wish to acknowledge our colleagues who have contributed, directly and indirectly, to the content of this book. Finally, we gratefully acknowledge BT, OMRON, Chevron, ONR, the EECS Department, the CITRIS program, and BISC associate members for their financial and technical support, and especially Prof. Shankar Sastry, CITRIS Director and former EECS Chair, for his special support for this event, which made the meeting and the preparation and publication of this book possible.

Masoud Nikravesh, Lotfi A. Zadeh, and Janusz Kacprzyk

Berkeley Initiative in Soft Computing (BISC)
University of California, Berkeley

Berkeley, California, USA
November 29, 2006


Contents

Web Intelligence, World Knowledge and Fuzzy Logic
Lotfi A. Zadeh . . . 1

Towards Human Consistent Linguistic Summarization of Time Series via Computing with Words and Perceptions
Janusz Kacprzyk, Anna Wilbik and Sławomir Zadrozny . . . 17

Evolution of Fuzzy Logic: From Intelligent Systems and Computation to Human Mind
Masoud Nikravesh . . . 37

Pioneers of Vagueness, Haziness, and Fuzziness in the 20th Century
Rudolf Seising . . . 55

Selected Results of the Global Survey on Research, Instruction and Development Work with Fuzzy Systems
Vesa A. Niskanen . . . 83

Fuzzy Models and Interpolation
László T. Kóczy, János Botzheim and Tamás D. Gedeon . . . 111

Computing with Antonyms
E. Trillas, C. Moraga, S. Guadarrama, S. Cubillo, E. Castiñeira . . . 133

Morphic Computing: Concept and Foundation
Germano Resconi and Masoud Nikravesh . . . 155

A Nonlinear Functional Analytic Framework for Modeling and Processing Fuzzy Sets
Rui J. P. de Figueiredo . . . 171

Concept-Based Search and Questionnaire Systems
Masoud Nikravesh . . . 193

Towards Perception Based Time Series Data Mining
Ildar Z. Batyrshin and Leonid Sheremetov . . . 217



Veristic Variables and Approximate Reasoning for Intelligent Semantic Web Systems
Ronald R. Yager . . . 231

Uncertainty in Computational Perception and Cognition
Ashu M. G. Solo and Madan M. Gupta . . . 251

Computational Intelligence for Geosciences and Oil Exploration
Masoud Nikravesh . . . 267

Hierarchical Fuzzy Classification of Remote Sensing Data
Yan Wang, Mo Jamshidi, Paul Neville, Chandra Bales and Stan Morain . . . 333

Real World Applications of a Fuzzy Decision Model Based on Relationships between Goals (DMRG)
Rudolf Felix . . . 351

Applying Fuzzy Decision and Fuzzy Similarity in Agricultural Sciences
Marius Calin and Constantin Leonte . . . 369

Soft Computing in the Chemical Industry: Current State of the Art and Future Trends
Arthur Kordon . . . 397

Identifying Aggregation Weights of Decision Criteria: Application of Fuzzy Systems to Wood Product Manufacturing
Ozge Uncu, Eman Elghoneimy, William A. Gruver, Dilip B. Kotak and Martin Fleetwood . . . 415

On the Emergence of Fuzzy Logics from Elementary Percepts: the Checklist Paradigm and the Theory of Perceptions
Ladislav J. Kohout and Eunjin Kim . . . 437


List of Contributors

Anna Wilbik
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
[email protected]

Arthur Kordon
Engineering & Process Sciences, Core R&D, The Dow Chemical Company, 2301 N Brazosport Blvd, Freeport, TX 77541, USA
[email protected]

Ashu M. G. Solo
Maverick Technologies America Inc., Suite 808, 1220 North Market Street, Wilmington, Delaware, U.S.A. 19801
[email protected]

S. Cubillo · E. Castiñeira
Technical University of Madrid, Department of Applied Mathematics, Madrid, Spain

Dilip B. Kotak · Martin Fleetwood
Institute for Fuel Cell Innovation, National Research Council, Vancouver, BC, Canada
{dilip.kotak,martin.fleetwood}@nrc-cnrc.gc.ca

Eunjin Kim
Department of Computer Science, John D. Odegard School of Aerospace Science, University of North Dakota, Grand Forks, ND 58202-9015, USA

Germano Resconi
Catholic University, Brescia, Italy
[email protected]

Ildar Z. Batyrshin · Leonid Sheremetov
Research Program in Applied Mathematics and Computing (PIMAyC), Mexican Petroleum Institute, Eje Central Lazaro Cardenas 152, Col. San Bartolo Atepehuacan, C.P. 07730, Mexico, D.F., Mexico
{batyr,[email protected]}

János Botzheim
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary; Department of Computer Science, The Australian National University

Janusz Kacprzyk · Sławomir Zadrozny
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland; Warsaw School of Information Technology (WIT), ul. Newelska 6, 01-447 Warsaw, Poland
[email protected]
[email protected]@@ibspan.waw.pl




Ladislav J. Kohout
Department of Computer Science, Florida State University, Tallahassee, Florida 32306-4530, USA

László T. Kóczy
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary; Institute of Information Technology and Electrical Engineering, Széchenyi István University, Gyor, Hungary

Lotfi A. Zadeh
BISC Program, Computer Sciences Division, EECS Department, University of California, Berkeley, CA 94720, USA
[email protected]

Madan M. Gupta
Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada S7N
[email protected]

Marius Calin · Constantin Leonte
The "Ion Ionescu de la Brad" University of Agricultural Sciences and Veterinary Medicine of Iasi, Aleea Mihail Sadoveanu no. 3, Iasi 700490, Romania
{mcalin,cleonte}@univagro-iasi.ro

Masoud Nikravesh
BISC Program, Computer Sciences Division, EECS Department, University of California, Berkeley, and NERSC, Lawrence Berkeley National Lab., California 94720-1776, USA
[email protected]

Masoud Nikravesh
BISC Program, EECS Department, University of California, Berkeley, CA 94720, USA
[email protected]

Masoud Nikravesh
BISC Program, Computer Sciences Division, EECS Department and Imaging and Informatics Group-LBNL, University of California, Berkeley, CA 94720, USA
[email protected]

Masoud Nikravesh
Berkeley Initiative in Soft Computing (BISC) Program, Computer Science Division, Department of EECS, University of California, Berkeley, CA
[email protected]

Mo Jamshidi
Department of Electrical and Computer Engineering and ACE Center, University of Texas at San Antonio, San Antonio, TX, USA
[email protected]

Ozge Uncu · Eman Elghoneimy · William A. Gruver
School of Engineering Science, Simon Fraser University, 8888 University Dr., Burnaby, BC, Canada
{ouncu,eelghone,gruver}@sfu.ca

Paul Neville · Chandra Bales · Stan Morain
The Earth Data Analysis Center (EDAC), University of New Mexico, Albuquerque, New Mexico, 87131
{pneville,cbales,smorain}@edac.unm.edu

Ronald R. Yager
Machine Intelligence Institute, Iona College, New Rochelle, NY 10801
[email protected]



Rudolf Felix
FLS Fuzzy Logik Systeme GmbH, Joseph-von-Fraunhofer Straße 20, 44227 Dortmund, Germany
[email protected]

Rudolf Seising
Medical University of Vienna, Core Unit for Medical Statistics and Informatics, Vienna, Austria
[email protected]

Rui J. P. de Figueiredo
Laboratory for Intelligent Signal Processing and Communications, California Institute for Telecommunications and Information Technology, University of California Irvine, Irvine, CA 92697-2800

Tamás D. Gedeon
Department of Computer Science, The Australian National University

E. Trillas · C. Moraga · S. Guadarrama
Technical University of Madrid, Department of Artificial Intelligence

Vesa A. Niskanen
Department of Economics & Management, University of Helsinki, PO Box 27, 00014 Helsinki, Finland
[email protected]

Yan Wang
Intelligent Inference Systems Corp., NASA Research Park, MS: 566-109, Moffett Field, CA, USA
[email protected]


Web Intelligence, World Knowledge and Fuzzy Logic

Lotfi A. Zadeh

Abstract Existing search engines—with Google at the top—have many remarkable capabilities; but what is not among them is deduction capability—the capability to synthesize an answer to a query from bodies of information which reside in various parts of the knowledge base. In recent years, impressive progress has been made in enhancing the performance of search engines through the use of methods based on bivalent logic and bivalent-logic-based probability theory. But can such methods be used to add nontrivial deduction capability to search engines, that is, to upgrade search engines to question-answering systems? A view which is articulated in this note is that the answer is "No." The problem is rooted in the nature of world knowledge, the kind of knowledge that humans acquire through experience and education.

It is widely recognized that world knowledge plays an essential role in the assessment of relevance, summarization, search and deduction. But a basic issue which is not addressed is that much of world knowledge is perception-based, e.g., "it is hard to find parking in Paris," "most professors are not rich," and "it is unlikely to rain in midsummer in San Francisco." The problem is that (a) perception-based information is intrinsically fuzzy; and (b) bivalent logic is intrinsically unsuited to deal with fuzziness and partial truth.

To come to grips with the fuzziness of world knowledge, new tools are needed. The principal new tool—a tool which is briefly described in this note—is Precisiated Natural Language (PNL). PNL is based on fuzzy logic and has the capability to deal with partiality of certainty, partiality of possibility and partiality of truth. These are the capabilities that are needed to be able to draw on world knowledge for assessment of relevance, and for summarization, search and deduction.

Lotfi A. Zadeh
BISC Program, Computer Sciences Division, EECS Department, University of California, Berkeley, CA 94720, USA
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007



1 Introduction

In moving further into the age of machine intelligence and automated reasoning, we have reached a point where we can speak, without exaggeration, of systems which have a high machine IQ (MIQ) (Zadeh, [17]). The Web and especially search engines—with Google at the top—fall into this category. In the context of the Web, MIQ becomes Web IQ, or WIQ, for short.

Existing search engines have many remarkable capabilities. However, what is not among them is the deduction capability—the capability to answer a query by a synthesis of information which resides in various parts of the knowledge base. A question-answering system is by definition a system which has this capability. One of the principal goals of Web intelligence is that of upgrading search engines to question-answering systems. Achievement of this goal requires a quantum jump in the WIQ of existing search engines [1].

Can this be done with existing tools such as the Semantic Web [3], Cyc [8], OWL [13] and other ontology-centered systems [12, 14]—tools which are based on bivalent logic and bivalent-logic-based probability theory? It is beyond question that, in recent years, very impressive progress has been made through the use of such tools. But can we achieve a quantum jump in WIQ? A view which is advanced in the following is that bivalent-logic-based methods have intrinsically limited capability to address complex problems which arise in deduction from information which is pervasively ill-structured, uncertain and imprecise.

The major problem is world knowledge—the kind of knowledge that humans acquire through experience and education [5]. Simple examples of fragments of world knowledge are: Usually it is hard to find parking near the campus in early morning and late afternoon; Berkeley is a friendly city; affordable housing is nonexistent in Palo Alto; almost all professors have a Ph.D. degree; most adult Swedes are tall; and Switzerland has no ports.

Much of the information which relates to world knowledge—and especially to underlying probabilities—is perception-based (Fig. 1). Reflecting the bounded ability of sensory organs, and ultimately the brain, to resolve detail and store information, perceptions are intrinsically imprecise. More specifically, perceptions are f-granular in the sense that (a) the boundaries of perceived classes are unsharp; and (b) the values of perceived attributes are granular, with a granule being a clump of values drawn together by indistinguishability, similarity, proximity or functionality [2].

Imprecision of perception-based information is a major obstacle to dealing with world knowledge through the use of methods based on bivalent logic and bivalent-logic-based probability theory—both of which are intolerant of imprecision and partial truth. What is the basis for this contention? A very simple example offers an explanation.

Suppose that I have to come up with a rough estimate of Pat's age. The information which I have is: (a) Pat is about ten years older than Carol; and (b) Carol has two children: a son, in mid-twenties; and a daughter, in mid-thirties.

How would I come up with an answer to the query q: How old is Pat? First, using my world knowledge, I would estimate Carol's age, given (b); then I would



Fig. 1 Measurement-based vs. perception-based information. Measurement-based (numerical): it is 35 °C; Eva is 28; Tandy is three years older than Dana. Perception-based (linguistic): it is very warm; Eva is young; Tandy is a few years older than Dana; it is cloudy; traffic is heavy; Robert is very honest.

add "about ten years" to the estimate of Carol's age to arrive at an estimate of Pat's age. I would be able to describe my estimate in a natural language, but I would not be able to express it as a number, interval or a probability distribution.

How can I estimate Carol's age given (b)? Humans have an innate ability to process perception-based information—an ability that bivalent-logic-based methods do not have; nor would such methods allow me to add "about ten years" to my estimate of Carol's age.

There is another basic problem—the problem of relevance. Suppose that instead of being given (a) and (b), I am given a collection of data which includes (a) and (b), and it is my problem to search for and identify the data that are relevant to the query. I came across (a). Is it relevant to q? By itself, it is not. Then I came across (b). Is it relevant to q? By itself, it is not. Thus, what I have to recognize is that, in isolation, (a) and (b) are irrelevant, that is, uninformative, but, in combination, they are relevant, i.e., are informative. It is not difficult to recognize this in the simple example under consideration, but when we have a large database, the problem of identifying the data which in combination are relevant to the query is very complex.

Suppose that a proposition, p, is, in isolation, relevant to q, or, for short, is i-relevant to q. What is the degree to which p is relevant to q? For example, to what degree is the proposition: Carol has two children: a son, in mid-twenties, and a daughter, in mid-thirties, relevant to the query: How old is Carol? To answer this question, we need a definition of a measure of relevance. The problem is that there is no quantitative definition of relevance within existing bivalent-logic-based theories. We will return to the issue of relevance at a later point.

The example which we have considered is intended to point to the difficulty of dealing with world knowledge, especially in the contexts of assessment of relevance and deduction, even in very simple cases. The principal reason is that much of world knowledge is perception-based, and existing theories of knowledge representation



and deduction provide no tools for this purpose. Here is a test problem involving deduction from perception-based information.

1.1 The Tall Swedes Problem

Perception: Most adult Swedes are tall, with adult defined as over about 20 years in age.

Query: What is the average height of Swedes?

A fuzzy-logic-based solution of the problem will be given at a later point.

A concept which provides a basis for computation and reasoning with perception-based information is that of Precisiated Natural Language (PNL) [15]. The capability of PNL to deal with perception-based information suggests that it may play a significant role in dealing with world knowledge. A quick illustration is the Carol example. In this example, an estimate of Carol's age, arrived at through the use of PNL, would be expressed as a bimodal distribution of the form

Age(Carol) is ((P1, V1) + . . . + (Pn, Vn))

where the Vi are granular values of Age, e.g., less than about 20, between about 20 and about 30, etc.; Pi is a granular probability of the event (Age(Carol) is Vi), i = 1, . . . , n; and + should be read as "and." A brief description of the basics of PNL is presented in the following.

2 Precisiated Natural Language (PNL)

PNL deals with perceptions indirectly, through their description in a natural language, Zadeh [15]. In other words, in PNL a perception is equated to its description in a natural language. PNL is based on fuzzy logic—a logic in which everything is, or is allowed to be, a matter of degree. It should be noted that a natural language is, in effect, a system for describing perceptions.

The point of departure in PNL is the assumption that the meaning of a proposition, p, in a natural language, NL, may be represented as a generalized constraint of the form (Fig. 2)

X isr R,

where X is the constrained variable, R is a constraining relation which, in general, is not crisp (bivalent); and r is an indexing variable whose values define the modality of the constraint. The principal modalities are: possibilistic (r = blank); veristic (r = v); probabilistic (r = p); random set (r = rs); fuzzy graph (r = fg); usuality (r = u); and Pawlak set (r = ps). The set of all generalized constraints, together with their combinations, qualifications and rules of constraint propagation, constitutes



Fig. 2 Generalized constraint. A standard constraint has the form X ∈ C; a generalized constraint has the form X isr R (a GC-form of type r), in which X is the constrained variable, "isr" is the copula, r is the modality identifier, and R is the constraining relation. X may be an n-ary variable, X = (X1, . . . , Xn); X may have a structure, e.g., X = Location(Residence(Carol)); X may be a function of another variable, X = f(Y); and X may be conditioned, (X/Y). The modality r ranges over blank, v, p, rs, fg, u, ps, . . .

the Generalized Constraint Language (GCL). By construction, GCL is maximally expressive. In general, X, R and r are implicit in p. Thus, in PNL the meaning of p is precisiated through explicitation of the generalized constraint which is implicit in p, that is, through translation into GCL (Fig. 3). Translation of p into GCL is exemplified by the following.

(a) Monika is young −→ Age(Monika) is young,

where young is a fuzzy relation which is characterized by its membership function μyoung, with μyoung(u) representing the degree to which a numerical value of age, u, fits the description of age as "young."
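A membership function of this kind can be sketched in a few lines of code. The trapezoidal shape matches the fuzzy-number calibration used later in the chapter (Fig. 3); the breakpoints chosen here for "young" (fully young up to 25, not young from 40 on) are purely illustrative assumptions, not values given in the text.

```python
def trapezoid(u, a, b, c, d):
    """Trapezoidal membership function: 0 outside [a, d], 1 on [b, c],
    linear on the shoulders (a, b) and (c, d)."""
    if b <= u <= c:
        return 1.0
    if a < u < b:
        return (u - a) / (b - a)
    if c < u < d:
        return (d - u) / (d - c)
    return 0.0

# mu_young with assumed, illustrative breakpoints.
def mu_young(age):
    return trapezoid(age, 0, 0, 25, 40)

print(mu_young(20))  # full membership: 1.0
print(mu_young(30))  # partial membership, about 0.67
print(mu_young(45))  # no membership: 0.0
```

Any monotone-decreasing shoulder would serve equally well; the trapezoid is used only because it is the calibration the chapter itself adopts for fuzzy quantifiers.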

(b) Carol lives in a small city near San Francisco −→ Location(Residence(Carol)) is SMALL[City; μ] ∩ NEAR[City; μ],

where SMALL[City; μ] is the fuzzy set of small cities; NEAR[City; μ] is the fuzzy set of cities near San Francisco; and ∩ denotes fuzzy set intersection (conjunction).

(c) Most Swedes are tall −→ ΣCount(tall.Swedes/Swedes) is most.

In this example, the constrained variable is the relative ΣCount of tall Swedes among Swedes, and the constraining relation is the fuzzy quantifier "most," with

Fig. 3 Calibration of "most" and "usually," represented as trapezoidal fuzzy numbers with membership μ over the interval [0, 1]


6 L. A. Zadeh

"most" represented as a fuzzy number (Fig. 3). The relative ΣCount, ΣCount(A/B), is defined as follows. If A and B are fuzzy sets in a universe of discourse U = {u1, . . . , un}, with the grades of membership of ui in A and B being μi and νi, respectively, then by definition the relative ΣCount of A in B is expressed as

ΣCount(A/B) = (Σi μi ∧ νi) / (Σi νi),

where the conjunction, ∧, is taken to be min. More generally, the conjunction may be taken to be a t-norm [11].
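The relative ΣCount defined above is straightforward to compute. A minimal Python sketch, with min as the conjunction and purely illustrative membership grades:

```python
def sigma_count(mu_a, mu_b):
    """Relative ΣCount(A/B) with min as the conjunction (any t-norm could be used)."""
    num = sum(min(m, v) for m, v in zip(mu_a, mu_b))
    den = sum(mu_b)
    return num / den

# Hypothetical membership grades over a five-element universe:
tall  = [0.9, 0.7, 1.0, 0.3, 0.8]   # μ_i: degree of being tall
swede = [1.0, 1.0, 1.0, 1.0, 1.0]   # ν_i: degree of being a Swede (crisp here)
print(sigma_count(tall, swede))     # (0.9 + 0.7 + 1.0 + 0.3 + 0.8) / 5 = 0.74
```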

What is the rationale for introducing the concept of a generalized constraint? Conventional constraints are crisp (bivalent) and have the form X ∈ C, where X is the constrained variable and C is a crisp set. On the other hand, perceptions are f-granular, as was noted earlier. Thus, there is a mismatch between the f-granularity of perceptions and crisp constraints. What this implies is that the meaning of a perception does not lend itself to representation as a collection of crisp constraints; what is needed for this purpose are generalized constraints or, more generally, an element of the Generalized Constraint Language (GCL) [15].

Representation of the meaning of a perception as an element of GCL is a first step toward computation with perceptions, computation which is needed for deduction from data which are resident in world knowledge. In PNL, a concept which plays a key role in deduction is that of a protoform, an abbreviation of "prototypical form" [15].

If p is a proposition in a natural language, NL, its protoform, PF(p), is an abstraction of p which places in evidence the deep semantic structure of p. For example, the protoform of "Monika is young" is "A(B) is C," where A is an abstraction of "Age," B is an abstraction of "Monika" and C is an abstraction of "young." Similarly,

Most Swedes are tall −→ ΣCount(A/B) is Q,

where A is an abstraction of "tall Swedes," B is an abstraction of "Swedes" and Q is an abstraction of "most." Two propositions, p and q, are protoform-equivalent, or PF-equivalent for short, if they have identical protoforms. For example, p: Most Swedes are tall, and q: Few professors are rich, are PF-equivalent.

The concept of PF-equivalence suggests an important mode of organization of world knowledge, namely protoform-based organization. In this mode of organization, items of knowledge are grouped into PF-equivalent classes. For example, one such class may be the set of all propositions whose protoform is A(B) is C, e.g., Monika is young. The partially instantiated class Price(B) is low would be the set of all objects whose price is low. As will be seen in the following, protoform-based organization of world knowledge plays a key role in deduction from perception-based information.
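Grouping knowledge items into PF-equivalent classes can be pictured as indexing propositions by their protoform. A toy Python sketch; the propositions and protoform strings are illustrative placeholders, not part of the original text:

```python
from collections import defaultdict

# Hypothetical (proposition, protoform) pairs:
knowledge = [
    ("Monika is young", "A(B) is C"),
    ("Most Swedes are tall", "Q A's are B's"),
    ("Few professors are rich", "Q A's are B's"),
    ("Price of oil is low", "A(B) is C"),
]

classes = defaultdict(list)
for proposition, protoform in knowledge:
    classes[protoform].append(proposition)  # PF-equivalent classes

print(classes["Q A's are B's"])  # ['Most Swedes are tall', 'Few professors are rich']
```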

Basically, abstraction is a means of generalization. Abstraction is a familiar and widely used concept. In the case of PNL, abstraction plays an especially important role because PNL abandons bivalence. Thus, in PNL, the concept of a protoform is


Fig. 4 Precisiation and abstraction: the basic idea. A perception p is described in NL, the description NL(p) is precisiated into its generalized-constraint form GC(p) in GCL, and GC(p) is abstracted into its protoform PF(p) in PFL. GCL (Generalized Constraint Language) is maximally expressive.

not limited to propositions whose meaning can be represented within the conceptual structure of bivalent logic.

In relation to a natural language, NL, the elements of GCL may be viewed as precisiations of elements of NL. Abstraction of the elements of GCL gives rise to what is referred to as the Protoform Language, PFL (Fig. 4). A consequence of the concept of PF-equivalence is that the cardinality of PFL is orders of magnitude smaller than that of GCL or, equivalently, than that of the set of precisiable propositions in NL. The small cardinality of PFL plays an essential role in deduction.

The principal components of the structure of PNL (Fig. 5) are: (1) a dictionary from NL to GCL; (2) a dictionary from GCL to PFL (Fig. 6); (3) a multiagent, modular deduction database, DDB; and (4) a world knowledge database, WKDB. The constituents of DDB are modules, with a module consisting of a group of protoformal rules of deduction, expressed in PFL (Fig. 7), which are drawn from a particular domain, e.g., probability, possibility, usuality, fuzzy arithmetic, fuzzy

Fig. 5 Basic structure of PNL: a proposition p in NL is precisiated into p∗ = GC(p) in GCL via dictionary 1, and abstracted into p∗∗ = PF(p) in PFL via dictionary 2, supported by the world knowledge database (WKDB) and the deduction database (DDB)


Fig. 6 Structure of PNL: dictionaries. Dictionary 1 (NL to GCL) maps a proposition p to its precisiation p∗ (GC-form), e.g., "most Swedes are tall" to "ΣCount(tall.Swedes/Swedes) is most." Dictionary 2 (GCL to PFL) maps p∗ to its protoform PF(p∗), e.g., "Q A's are B's."

logic, search, etc. For example, a rule drawn from fuzzy logic is the compositional rule of inference [11], expressed as

X is A
(X, Y) is B
Y is A ◦ B        (symbolic part)

μA◦B(v) = maxu (μA(u) ∧ μB(u, v))        (computational part)

where A ◦ B is the composition of A and B, defined in the computational part, in which μA, μB and μA◦B are the membership functions of A, B and A ◦ B, respectively. Similarly, a rule drawn from probability is
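Over a finite universe, the compositional rule of inference reduces to a max-min composition of a fuzzy set with a fuzzy relation. A self-contained Python sketch with hypothetical membership grades:

```python
def compose(mu_a, mu_b):
    """Compositional rule of inference: μ_{A∘B}(v) = max_u (μ_A(u) ∧ μ_B(u, v)),
    with A a fuzzy set on U, B a fuzzy relation on U × V, and min for ∧."""
    us = range(len(mu_a))
    vs = range(len(mu_b[0]))
    return [max(min(mu_a[u], mu_b[u][v]) for u in us) for v in vs]

A = [1.0, 0.6, 0.2]                       # hypothetical μ_A over U = {u0, u1, u2}
B = [[0.3, 1.0], [0.8, 0.5], [1.0, 0.1]]  # hypothetical μ_B over U × V
print(compose(A, B))                      # [0.6, 1.0]
```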

Prob(X is A) is B
Prob(X is C) is D        (symbolic part)

μD(v) = supg (μB (∫U μA(u) g(u) du))        (computational part)

subject to:

v = ∫U μC(u) g(u) du,   ∫U g(u) du = 1,

where D is defined in the computational part and g is the probability density function of X.

The rules of deduction in DDB are, basically, the rules which govern the propagation of generalized constraints. Each module is associated with an agent whose function is that of controlling the execution of rules and performing embedded computations. The top-level agent controls the passing of results of computation from a module to other modules. The structure of protoformal, i.e., protoform-based, deduction is shown in Fig. 5. A simple example of protoformal deduction is shown in Fig. 8.


Fig. 7 Basic structure of PNL: the modular deduction database, whose agent-controlled modules include the possibility, probability, search, fuzzy logic, fuzzy arithmetic and extension principle modules

The principal deduction rule in PNL is the extension principle [19]. Its symbolic part is expressed as

f(X) is A
g(X) is B

in which the antecedent is a constraint on X through a function of X, f(X), and the consequent is the induced constraint on a function of X, g(X).

The computational part of the rule is expressed as

μB(v) = supu (μA(f(u)))

subject to

v = g(u).
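For a finite universe, the extension principle can be evaluated by brute force: enumerate u, propagate μA(f(u)) to v = g(u), and keep the supremum. A Python sketch with an illustrative fuzzy set "small"; the calibration is hypothetical:

```python
def extend(universe, mu_A, f, g):
    """Brute-force extension principle: μ_B(v) = sup over {u : g(u) = v} of μ_A(f(u))."""
    B = {}
    for u in universe:
        v = g(u)
        B[v] = max(B.get(v, 0.0), mu_A(f(u)))
    return B

# Illustrative fuzzy set "small" on the integers; induced constraint on X².
mu_small = lambda x: max(0.0, 1.0 - abs(x) / 3.0)
B = extend(range(-3, 4), mu_small, lambda u: u, lambda u: u * u)
print(B[4])  # ≈ 0.333, attained at u = ±2
```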

Fig. 8 Example of protoformal reasoning. From p: "Dana is young," GC(p): Age(Dana) is young, PF(p): X is A; and p: "Tandy is a few years older than Dana," GC(p): Age(Tandy) is (Age(Dana) + few), PF(p): Y is (X + B); the rule {X is A; Y is (X + B)} entails Y is A + B, which yields Age(Tandy) is (young + few), with μA+B(v) = supu (μA(u) ∧ μB(v − u))
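The fuzzy addition young + few in Fig. 8 is an instance of the sup-min convolution μA+B(v) = supu (μA(u) ∧ μB(v − u)). A Python sketch on a coarse integer grid; the calibrations of "young" and "few" are hypothetical:

```python
def fuzzy_add(A, B):
    """Sup–min extension of addition: μ_{A+B}(v) = sup_u (μ_A(u) ∧ μ_B(v − u)).
    A and B map integer points to membership grades (dicts)."""
    out = {}
    for u, ma in A.items():
        for w, mb in B.items():
            v = u + w
            out[v] = max(out.get(v, 0.0), min(ma, mb))
    return out

# Hypothetical coarse calibrations on an integer age axis:
young = {20: 1.0, 25: 1.0, 30: 0.5, 35: 0.0}
few   = {2: 0.5, 3: 1.0, 4: 0.5}
older = fuzzy_add(young, few)
print(older[28])  # min(μ_young(25), μ_few(3)) = 1.0
```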


To illustrate the use of the extension principle, we will consider the Tall Swedes Problem:

p: Most adult Swedes are tall
q: What is the average height of adult Swedes?

Let P = {u1, . . . , uN} be a population of Swedes, with the height of ui being hi, i = 1, . . . , N, and μa(ui) representing the degree to which ui is an adult. The average height of adult Swedes is denoted as have. The first step is precisiation of p, using the concept of the relative ΣCount:

p −→ ΣCount(tall ∧ adult.Swedes/adult.Swedes) is most

or, more explicitly,

p −→ (Σi μtall(hi) ∧ μa(ui)) / (Σi μa(ui)) is most.

The next step is precisiation of q:

q −→ have = (1/N) Σi hi

have is ?B,

where B is a fuzzy number. Using the extension principle, computation of have as a fuzzy number is reduced to the solution of the variational problem

μB(v) = suph (μmost ((Σi μtall(hi) ∧ μa(ui)) / (Σi μa(ui)))),   h = (h1, . . . , hN),

subject to

v = (1/N) Σi hi.

Note that we assume that a problem is solved once it is reduced to the solution of a well-defined computational problem.
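For a small, discretized toy instance, the variational problem can indeed be solved by exhaustive search. A Python sketch; the calibrations of "most" and "tall," the toy population and the height grid are all hypothetical:

```python
import itertools

# Hypothetical calibrations (clamped ramps) of "most" and "tall":
def mu_most(r):
    return max(0.0, min(1.0, (r - 0.5) / 0.25))     # 0 below 50%, 1 above 75%

def mu_tall(h):
    return max(0.0, min(1.0, (h - 170.0) / 15.0))   # heights in cm

adult = [1.0, 1.0, 0.5]          # μ_a(u_i) for a toy population, N = 3
heights = [160, 170, 180, 190]   # discretized height axis

def mu_B(v, tol=1.0):
    """μ_B(v) by brute force: sup over height profiles h with (1/N) Σ h_i ≈ v."""
    best = 0.0
    for h in itertools.product(heights, repeat=len(adult)):
        if abs(sum(h) / len(h) - v) <= tol:          # constraint v = (1/N) Σ h_i
            r = sum(min(mu_tall(hi), a) for hi, a in zip(h, adult)) / sum(adult)
            best = max(best, mu_most(r))
    return best

print(mu_B(180.0))  # 1.0 for this toy data
```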

3 PNL as a Definition Language

One of the important functions of PNL is that of serving as a high-level definition language. More concretely, suppose that I have a concept, C, which I wish to define; for example, the concept of the distance between two real-valued functions, f and g, defined on the real line.


The standard approach is to use a metric such as L1 or L2. But a standard metric may not capture my perception of the distance between f and g. The PNL-based approach would be to describe my perception of distance in a natural language and then precisiate the description.

This basic idea opens the door to (a) definition of concepts for which no satisfactory definitions exist, e.g., the concepts of causality, relevance and rationality, among many others, and (b) redefinition of concepts whose existing definitions do not provide a good fit to reality. Among such concepts are the concepts of similarity, stability, independence, stationarity and risk.

How can we concretize the meaning of "good fit"? In what follows, we do this through the concept of cointension.

More specifically, let U be a universe of discourse and let C be a concept which I wish to define, with C relating to elements of U. For example, U is a set of buildings and C is the concept of a tall building. Let p(C) and d(C) be, respectively, my perception and my definition of C. Let I(p(C)) and I(d(C)) be the intensions of p(C) and d(C), respectively, with "intension" used in its logical sense [6, 7], that is, as a criterion or procedure which identifies those elements of U which fit p(C) or d(C). For example, in the case of tall buildings, the criterion may involve the height of a building.

Informally, a definition, d(C), is a good fit or, more precisely, is cointensive, if its intension coincides with the intension of p(C). A measure of goodness of fit is the degree to which the intension of d(C) coincides with that of p(C). In this sense, cointension is a fuzzy concept. As a high-level definition language, PNL makes it possible to formulate definitions whose degree of cointensiveness is higher than that of definitions formulated through the use of languages based on bivalent logic.
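One plausible (not the author's) way to turn the degree of coincidence of two intensions into a number is a Jaccard-style index computed with ΣCount, min and max. A Python sketch comparing a crisp definition of "tall building" with a graded perception; all calibrations are hypothetical:

```python
def cointension(mu_d, mu_p, universe):
    """A possible fuzzification of "degree of coincidence" of two intensions:
    ΣCount of the intersection over ΣCount of the union (min/max)."""
    inter = sum(min(mu_d(u), mu_p(u)) for u in universe)
    union = sum(max(mu_d(u), mu_p(u)) for u in universe)
    return inter / union if union else 1.0

# Hypothetical "tall building" intensions (heights in metres):
d = lambda h: 1.0 if h >= 100 else 0.0             # crisp, bivalent definition
p = lambda h: max(0.0, min(1.0, (h - 80) / 40.0))  # graded perception
buildings = [60, 90, 100, 120, 160]
print(round(cointension(d, p, buildings), 3))      # ≈ 0.769
```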

A substantive exposition of PNL as a definition language is beyond the scope of this note. In what follows, we shall consider, as an illustration, a relatively simple version of the concept of relevance.

4 Relevance

We shall examine the concept of relevance in the context of a relational model such as the one shown in Fig. 9. For concreteness, the attributes A1, . . . , An may be interpreted as symptoms and D as diagnosis. For convenience, rows which involve the same value of D are grouped together. The entries are assumed to be labels of fuzzy sets. For example, A5 may be blood pressure and a53 may be "high."

An entry represented as * means that the entry in question is conditionally redundant, in the sense that its value has no influence on the value of D (Fig. 10).

An attribute, Aj, is redundant, and hence deletable, if it is conditionally redundant for all values of Name. An algorithm, termed compactification, which identifies all deletable attributes, is described in Zadeh [18]. The compactification algorithm is a generalization of the Quine–McCluskey algorithm for the minimization of switching circuits. The reduct algorithm in the theory of rough sets [10] is closely related to the compactification algorithm.
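The idea behind deletability can be illustrated with a crisp toy decision table: an attribute is deletable if dropping its column never makes two rows conflict on the diagnosis. This is only a bivalent sketch of the idea, not Zadeh's compactification algorithm itself; the table is hypothetical:

```python
def deletable(rows, j):
    """A_j is deletable if dropping column j never creates a conflict:
    no two rows that agree on all remaining attributes map to different diagnoses."""
    seen = {}
    for row, d in rows:
        key = tuple(v for k, v in enumerate(row) if k != j)
        if key in seen and seen[key] != d:
            return False
        seen[key] = d
    return True

# Hypothetical decision table: (symptom values, diagnosis)
table = [
    (("high", "yes"), "d1"),
    (("high", "no"),  "d1"),
    (("low",  "yes"), "d2"),
    (("low",  "no"),  "d2"),
]
print(deletable(table, 1))  # True: the second symptom never influences D
print(deletable(table, 0))  # False
```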


Fig. 9 A relational model of decision (relevance, redundance and deletability): a decision table with columns Name, A1, . . . , Aj, . . . , An and D, where Aj is the j-th symptom, aij is the value of the j-th symptom of Namei, and D is the diagnosis; rows sharing the same diagnosis value d1, d2, . . . , dr are grouped together

Fig. 10 Conditional redundance and redundance (deletability). Aj is conditionally redundant for Namer, with A1 being ar1, . . . , An being arn, if D takes the same value for all possible values of Aj (the entry marked *). Aj is redundant if it is conditionally redundant for all values of Name; cf. the compactification algorithm (Zadeh, 1976) and the Quine–McCluskey algorithm

The concept of relevance (informativeness) is weaker and more complex than that of redundance (deletability). As was noted earlier, it is necessary to differentiate between relevance in isolation (i-relevance) and relevance as a group. In the following, relevance should be interpreted as i-relevance.

A value of Aj, say arj, is irrelevant (uninformative) if the proposition "Aj is arj" does not constrain D (Fig. 11). For example, the knowledge that blood pressure is high may convey no information about the diagnosis (Fig. 12).

Fig. 11 Irrelevance: the same value aij of Aj occurs in rows with different diagnoses d1 and d2, so "(Aj is aij)" is irrelevant (uninformative)

Fig. 12 Relevance and irrelevance: a constraint on Aj induces a constraint on D (example: "blood pressure is high" constrains D); "(Aj is arj)" is uninformative if D is unconstrained; Aj is irrelevant if "(Aj is arj)" is uninformative for all arj; irrelevance does not entail deletability

An attribute, Aj, is irrelevant (uninformative) if, for all arj, the proposition "Aj is arj" does not constrain D. What is important to note is that irrelevance does not imply deletability, as redundance does. The reason is that Aj may be i-irrelevant but not irrelevant in combination with other attributes. An example is shown in Fig. 13.

As defined above, relevance and redundance are bivalent concepts, with no degrees of relevance or redundance allowed. But if the definitions in question are interpreted as idealizations, then a measure of the departure from the ideal could be used as a measure of the degree of relevance or redundance. Such a measure could be defined through the use of PNL. In this way, PNL may provide a basis for defining relevance and redundance as matters of degree, as they are in realistic settings. However, what should be stressed is that our analysis is limited to relational models. Formalizing the concepts of relevance and redundance in the context of the Web is a far more complex problem, a problem for which no cointensive solution is in sight.

Fig. 13 Redundance and irrelevance: an example with two attributes A1 and A2 and D taking the values black or white. In one configuration A1 and A2 are individually irrelevant (uninformative) but not deletable; in the other, A2 is redundant (deletable)


5 Concluding Remark

Much of the information which resides in the Web, and especially in the domain of world knowledge, is imprecise, uncertain and partially true. Existing bivalent-logic-based methods of knowledge representation and deduction are of limited effectiveness in dealing with information which is imprecise or partially true. To deal with such information, bivalence must be abandoned and new tools, such as PNL, should be employed. What is quite obvious is that, given the deeply entrenched tradition of basing scientific theories on bivalent logic, a call for the abandonment of bivalence is not likely to meet a warm response. Abandonment of bivalence will eventually become a reality, but it will be a gradual process.

References

1. Arjona, J.; Corchuelo, R.; Pena, J.; Ruiz, D. 2003. Coping with Web Knowledge. Advances in Web Intelligence. Springer-Verlag, Berlin, Heidelberg, 165–178.
2. Bargiela, A.; Pedrycz, W. 2003. Granular Computing: An Introduction. Kluwer Academic Publishers, Boston, Dordrecht, London.
3. Berners-Lee, T.; Hendler, J.; Lassila, O. 2001. The Semantic Web. Scientific American.
4. Chen, P. P. 1983. Entity-Relationship Approach to Information Modeling and Analysis. North Holland.
5. Craven, M.; DiPasquo, D.; Freitag, D.; McCallum, A.; Mitchell, T.; Nigam, K.; Slattery, S. 2000. Learning to Construct Knowledge Bases from the World Wide Web. Artificial Intelligence 118(1–2): 69–113.
6. Cresswell, M. J. 1973. Logic and Languages. Methuen, London.
7. Gamat, T. F. 1996. Language, Logic and Linguistics. University of Chicago Press.
8. Lenat, D. B. 1995. CYC: A Large-Scale Investment in Knowledge Infrastructure. Communications of the ACM 38(11): 32–38.
9. Novak, V.; Perfilieva, I., eds. 2000. Discovering the World with Fuzzy Logic. Studies in Fuzziness and Soft Computing. Physica-Verlag, Heidelberg, New York.
10. Pawlak, Z. 1991. Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic, Dordrecht.
11. Pedrycz, W.; Gomide, F. 1998. Introduction to Fuzzy Sets. MIT Press, Cambridge, Mass.
12. Smith, B.; Welty, C. 2002. What is Ontology? Ontology: Towards a New Synthesis. Proceedings of the Second International Conference on Formal Ontology in Information Systems.
13. Smith, M. K.; Welty, C.; McGuinness, D., eds. 2003. OWL Web Ontology Language Guide. W3C Working Draft 31.
14. Sowa, J. F. 1999. Ontological Categories. In Albertazzi, L. (ed.), Shapes of Forms: From Gestalt Psychology and Phenomenology to Ontology and Mathematics. Kluwer Academic Publishers, Dordrecht, 307–340.
15. Zadeh, L. A. 2001. A New Direction in AI: Toward a Computational Theory of Perceptions. AI Magazine 22(1): 73–84.
16. Zadeh, L. A. 1999. From Computing with Numbers to Computing with Words: From Manipulation of Measurements to Manipulation of Perceptions. IEEE Transactions on Circuits and Systems 45(1): 105–119.
17. Zadeh, L. A. 1994. Fuzzy Logic, Neural Networks, and Soft Computing. Communications of the ACM 37: 77–84.
18. Zadeh, L. A. 1976. A Fuzzy-Algorithmic Approach to the Definition of Complex or Imprecise Concepts. Int. J. Man-Machine Studies 8: 249–291.
19. Zadeh, L. A. 1975. The Concept of a Linguistic Variable and Its Application to Approximate Reasoning, Parts I–III. Information Sciences 8: 199–249; 8: 301–357; 9: 43–80.


Towards Human Consistent Linguistic Summarization of Time Series via Computing with Words and Perceptions

Janusz Kacprzyk, Anna Wilbik and Sławomir Zadrozny

Abstract We present a new, human consistent approach to the summarization of numeric time series, aimed at capturing the very essence of what occurs in the data in terms of an increase, decrease and stable values of data, when relevant changes occur, and how long the periods of particular behavior types (exemplified by an increase, a decrease and a stable period) last, all with some intensity. We use natural language descriptions that are derived by using tools of Zadeh's computing with words and perceptions paradigm, notably tools and techniques involving fuzzy linguistic quantifiers. More specifically, we use the idea of fuzzy logic based linguistic summaries of data(bases) in the sense of Yager, later developed by Kacprzyk and Yager, and by Kacprzyk, Yager and Zadrozny. We extend that idea to a dynamic setting, to the summarization of trends in time series characterized by dynamics of change, duration and variability. For the aggregation of partial trends, which is a crucial element in the problem considered, we apply the traditional Zadeh fuzzy logic based calculus of linguistically quantified propositions and the Sugeno integral. The results obtained are promising.

1 Introduction

The availability of information is a prerequisite for coping with the difficult, changeable and competitive environments within which individuals, human groups of various sizes, and organizations have to operate. To be effective and efficient, information

Janusz Kacprzyk · Sławomir Zadrozny
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
Warsaw School of Information Technology (WIT), ul. Newelska 6, 01-447 Warsaw, Poland
e-mail: [email protected]
e-mail: zadrozny@ibspan.waw.pl

Anna Wilbik
Systems Research Institute, Polish Academy of Sciences, ul. Newelska 6, 01-447 Warsaw, Poland
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


18 J. Kacprzyk et al.

should be not only easily available but also in a form that is comprehensible to individuals, groups or organizations. This means, first of all, that it should be in a proper language. In our context we have human beings as individuals, human groups and human organizations. Unfortunately, this brings about serious difficulties: with computer systems being indispensable for supporting the tasks related to the acquisition, processing and presentation of information, and with a human being as a crucial element, there is an inherent discrepancy between their very specifics. Notably, this is mainly related to the remarkable human ability to properly understand imprecise information resulting from the use of natural language, and to use it purposefully. This is clearly not the case for computers and information technology.

A real challenge is to attain a synergistic collaboration between a human being and a computer, taking advantage of the strengths of both. This is the basic philosophy of our approach. We are concerned with time distributed sequences of numerical data representing how a quantity of interest evolves over time. A human being has a remarkable ability to capture the essence of how this quantity evolves in terms of duration, variability, etc. However, when asked to articulate or assess this, a human being is normally able to provide only an imprecise and vague statement in a natural language, exemplified by "values grow strongly in the beginning, then stabilize, and finally decrease sharply".

The traditional approach to such trend analysis is based on the detection of changes of the first and second derivative (e.g., [8, 12, 20, 21]). Some first attempts at using nontraditional, natural language based analyses appeared in the works by Batyrshin [5] and Batyrshin et al. [3], [4]. Our approach, though maybe somewhat similar in spirit, is different with respect to the point of departure and the tools and techniques employed, even if both are heavily based on the philosophy of Zadeh's computing with words.

In this paper we employ linguistic summaries of databases, introduced by Yager [25] and then further developed by Kacprzyk and Yager [14], and Kacprzyk, Yager and Zadrozny [15]. They serve a similar purpose: they are meant to capture, in a simple natural language like form, essential features of the original data according to the user's needs or interests. Therefore some attempts have been made to apply them to temporal data, too (cf., e.g., [7]), but that approach is different from the one proposed in this paper.

Basically, our approach boils down to the extraction of (partial) trends which express whether, over a particular time span (window), usually small in relation to the total time horizon of interest, the data increase, decrease or are stable, all this with some intensity. We use for this purpose a modification of the well known Sklansky and Gonzalez [23] algorithm, which is simple and efficient.

Then, those consecutive partial trends are aggregated to obtain a linguistic summary of trends in a particular time series. For instance, in the case of stock exchange data, a linguistic summary of the type considered may be: "initially, prices of most stocks increased sharply, then stabilized, and finally prices of almost all stocks exhibited a strong variability". One can use various methods of aggregation; here we first use the classic one based on Zadeh's [26] calculus of linguistically quantified propositions, and then the Sugeno [24] integral.


Human Consistent Linguistic Summarization of Time Series 19

2 Temporal Data and Trend Analysis

We deal with numerical data that vary over time, and a time series is a sequence of data measured at uniformly spaced time moments. We will identify trends with linearly increasing, stable or decreasing functions, and therefore represent given time series data as piecewise linear functions. Evidently, the intensity of an increase or decrease (the slope) will matter, too. These are clearly partial trends, as a global trend in a time series concerns the entire time span of the time series; there may also be trends that concern parts of the entire time span, yet more than the particular window taken into account while extracting partial trends by using the Sklansky and Gonzalez algorithm.

In particular, we use the concept of a uniform piecewise linear approximation of a time series. A function f is a uniform ε-approximation of a time series, or of a set of points {(xi, yi)}, if for a given, context dependent ε > 0, there holds

∀i : |f(xi) − yi| ≤ ε   (1)
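Condition (1) is easy to test directly. A small Python sketch with illustrative data:

```python
def is_uniform_eps_approx(f, points, eps):
    """Check condition (1): |f(x_i) − y_i| ≤ ε for every data point."""
    return all(abs(f(x) - y) <= eps for x, y in points)

# Hypothetical noisy samples around the line y = 2x + 1:
data = [(0, 1.1), (1, 2.8), (2, 5.2), (3, 6.9)]
print(is_uniform_eps_approx(lambda x: 2 * x + 1, data, eps=0.25))  # True
print(is_uniform_eps_approx(lambda x: 2 * x + 1, data, eps=0.05))  # False
```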

We use a modification of the well known, effective and efficient Sklansky and Gonzalez [23] algorithm, which finds a linear uniform ε-approximation for subsets of points of a time series. The algorithm constructs the intersection of cones starting from a point pi of the time series and including the circle of radius ε around the subsequent data points pi+j, j = 1, 2, . . ., until the intersection of all cones starting at pi is empty. If for pi+k the intersection is empty, then we construct a new cone starting at pi+k−1. Figures 1 and 2 present the idea of the algorithm. The family of possible solutions is indicated as a gray area. Clearly, other algorithms can also be used, and there are a lot of them in the literature.

To present details of the algorithm, let us first denote:

• p_0 – the point starting the current cone,
• p_1 – the last point checked in the current cone,
• p_2 – the next point to be checked,

Fig. 1 An illustration of the algorithm for the uniform ε-approximation: the intersection of the cones, spanned by the angle pairs (γ1, β1) and (γ2, β2) from p0 toward p1 and p2, is indicated by the dark grey area


Fig. 2 An illustration of the algorithm for the uniform ε-approximation: a new cone starts at point p2

• Alpha_01 – a pair of angles (γ1, β1), meant as an interval, that defines the current cone, as shown in Fig. 1,
• Alpha_02 – the pair of angles of the cone starting at the point p_0 and inscribing the circle of radius ε around the point p_2 (cf. (γ2, β2) in Fig. 1),
• function read_point() reads the next point of the data series,
• function find() finds the pair of angles of the cone starting at the point p_0 and inscribing the circle of radius ε around the point p_2.

The pseudocode of the procedure that extracts the trends is depicted in Fig. 3. The bounding values (γ2, β2) of Alpha_02, computed by the function find(), correspond to the slopes of two lines which:

• are tangent to the circle of radius ε around the point p2 = (x2, y2), and
• start at the point p0 = (x0, y0).

Thus,

γ2 = arctg( (ΔxΔy − ε√((Δx)² + (Δy)² − ε²)) / ((Δx)² − ε²) )

and

β2 = arctg( (ΔxΔy + ε√((Δx)² + (Δy)² − ε²)) / ((Δx)² − ε²) ),

where Δx = x0 − x2 and Δy = y0 − y2.

read_point(p_0); read_point(p_1);
while(1) {
    p_2 = p_1;
    Alpha_02 = find();
    Alpha_01 = Alpha_02;
    do {
        Alpha_01 = Alpha_01 ∩ Alpha_02;
        p_1 = p_2;
        read_point(p_2);
        Alpha_02 = find();
    } while (Alpha_01 ∩ Alpha_02 ≠ ∅);
    save_found_trend();
    p_0 = p_1;
    p_1 = p_2;
}

Fig. 3 Pseudocode of the modified Sklansky and Gonzalez [23] procedure for extracting trends

The resulting ε-approximation of a group of points p_0, . . . , p_1 is either a single segment, chosen, e.g., as a bisector or as the line that minimizes some distance (e.g., the sum of squared errors, SSE) from the approximated points, or the whole family of possible solutions, i.e., the rays of the cone.

This method is effective and efficient, as it requires only a single pass through the data. We will now identify (partial) trends with the line segments of the constructed piecewise linear function.
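The procedure of Fig. 3, together with the (γ2, β2) formulas above, can be sketched in Python as follows. The cone is tracked as an interval of slope angles; the example data and the choice to return segments as index pairs are illustrative:

```python
import math

def cone_angles(p0, p2, eps):
    """Angles (γ2, β2) of the cone from p0 tangent to the ε-circle around p2,
    per the arctg formulas above (assumes the points are more than ε apart)."""
    dx, dy = p0[0] - p2[0], p0[1] - p2[1]
    root = math.sqrt(dx * dx + dy * dy - eps * eps)
    return (math.atan((dx * dy - eps * root) / (dx * dx - eps * eps)),
            math.atan((dx * dy + eps * root) / (dx * dx - eps * eps)))

def extract_trends(points, eps):
    """Single-pass sketch of the modified Sklansky–Gonzalez procedure:
    intersect the cones until empty, then save the trend and restart."""
    segments, i0 = [], 0
    lo, hi = -math.pi / 2, math.pi / 2
    i = 1
    while i < len(points):
        g, b = cone_angles(points[i0], points[i], eps)
        if max(lo, g) <= min(hi, b):          # Alpha_01 ∩ Alpha_02 ≠ ∅
            lo, hi = max(lo, g), min(hi, b)
            i += 1
        else:                                 # save the found trend, restart cone
            segments.append((i0, i - 1))
            i0 = i - 1
            lo, hi = -math.pi / 2, math.pi / 2
    segments.append((i0, len(points) - 1))
    return segments

# Two linear pieces with a small tolerance:
print(extract_trends([(0, 0), (1, 1), (2, 2), (3, 1), (4, 0)], 0.1))  # [(0, 2), (2, 4)]
```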

3 Dynamic Characteristics of Trends

In our approach, while summarizing trends in time series data, we consider the following three aspects:

• dynamics of change,
• duration, and
• variability,

and it should be noted that by trends we mean here global trends, concerning the entire time series (or some, probably large, part of it), not the partial trends concerning a small time span (window) taken into account in the (partial) trend extraction phase via the Sklansky and Gonzalez [23] algorithm.

In what follows we will briefly discuss these factors.


3.1 Dynamics of Change

By dynamics of change we understand the speed of change. It can be described by the slope of a line representing the trend (cf. α in Fig. 1). Thus, to quantify the dynamics of change we may use the interval of possible angles α ∈ ⟨−90°, 90°⟩ or some trigonometric transformation thereof.

However, it might be impractical to use such a scale directly while describing trends. Therefore we may use a fuzzy granulation in order to meet the user's needs and the task specificity. The user may construct a scale of linguistic terms corresponding to various directions of a trend line, e.g.:

• quickly decreasing,
• decreasing,
• slowly decreasing,
• constant,
• slowly increasing,
• increasing,
• quickly increasing.

Figure 4 illustrates the lines corresponding to the particular linguistic terms. In fact, each term represents a fuzzy granule of directions. In Batyrshin et al. [1, 4] many methods of constructing such a fuzzy granulation are presented. The user may define the membership functions of particular linguistic terms depending on his or her needs.

Fig. 4 A visual representation of the angle granules defining the dynamics of change: quickly decreasing, decreasing, slowly decreasing, constant, slowly increasing, increasing, quickly increasing


We map a single value α (or the whole interval of angles corresponding to the gray area in Fig. 2) characterizing the dynamics of change of a trend, identified using the algorithm shown as pseudocode in Fig. 3, into the fuzzy set (linguistic label) best matching the given angle. We can use, for instance, some measure of distance or similarity, cf. the book by Cross and Sudkamp [9]. Then we say that a given trend is, e.g., "decreasing to a degree 0.8" if μdecreasing(α) = 0.8, where μdecreasing is the membership function of the fuzzy set representing "decreasing" that is the best match for the angle α.

3.2 Duration

Duration describes the length of a single trend. It is meant as a linguistic variable, exemplified by a "long trend" defined as a fuzzy set whose membership function may be as in Fig. 5, where the OX axis is the time axis divided into appropriate units.

The definitions of the linguistic terms describing the duration clearly depend on the perspective or purpose assumed by the user.

3.3 Variability

Variability refers to how "spread out" (vertically, in the sense of the values taken on) a group of data is. The following five statistical measures of variability are widely used in traditional analyses:

• The range (maximum – minimum). Although the range is computationally theeasiest measure of variability, it is not widely used, as it is based on only twodata points that are extreme. This make it very vulnerable to outliers and thereforemay not adequately describe the true variability.

• The interquartile range (IQR), calculated as the third quartile (the 75th percentile) minus the first quartile (the 25th percentile), which may be interpreted as representing the middle 50% of the data. It is resistant to outliers and is computationally as easy as the range.

• The variance, calculated as (∑_i (xi − x̄)²) / n, where x̄ is the mean value.

• The standard deviation – the square root of the variance. Both the variance and the standard deviation are affected by extreme values.

Fig. 5 Example of a membership function describing the term “long” concerning the trend duration (OX: time t; OY: membership μ(t) ∈ [0, 1])


• The mean absolute deviation (MAD), calculated as (∑_i |xi − x̄|) / n. It is not frequently encountered in mathematical statistics. This is essentially because, while the mean deviation has a natural intuitive definition as the “mean deviation from the mean”, the introduction of the absolute value makes analytical calculations using this statistic much more complicated.
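For reference, the five measures can be computed in a few lines of pure Python (a minimal sketch; the percentile uses linear interpolation, which is one of several common conventions):

```python
def variability_measures(xs):
    """Compute the five classical variability measures of a data sample."""
    n = len(xs)
    s = sorted(xs)
    mean = sum(xs) / n

    def percentile(p):
        # Linear-interpolation percentile over the sorted sample.
        k = (n - 1) * p
        lo, hi = int(k), min(int(k) + 1, n - 1)
        return s[lo] + (s[hi] - s[lo]) * (k - lo)

    rng = s[-1] - s[0]                           # range: maximum - minimum
    iqr = percentile(0.75) - percentile(0.25)    # interquartile range
    var = sum((x - mean) ** 2 for x in xs) / n   # (population) variance
    std = var ** 0.5                             # standard deviation
    mad = sum(abs(x - mean) for x in xs) / n     # mean absolute deviation
    return rng, iqr, var, std, mad
```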

We propose to measure the variability of a trend as the distance of the data points covered by this trend from the linear uniform ε-approximation (cf. Sect. 2) that represents the given trend. For this purpose we propose to employ the distance between a point and the family of possible solutions, indicated as a gray cone in Fig. 1. Equation (1) assures that this distance is definitely smaller than ε, and we may use this information for normalization. The normalized distance equals 0 if the point lies in the gray area. Otherwise, it is equal to the distance to the nearest point belonging to the cone, divided by ε. Alternatively, we may bisect the cone and then compute the distance between the point and this bisecting ray.
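The bisecting-ray variant can be sketched as follows. Representing a trend's cone by its apex point and two boundary angles (in degrees) is an assumption made here for illustration; the chapter defines the cone via the ε-approximation of Sect. 2.

```python
import math

def normalized_variability(point, apex, alpha_min, alpha_max, eps):
    """Distance of a data point from the bisecting ray of the cone,
    normalized by eps and clipped to [0, 1]."""
    # Direction of the ray bisecting the cone of admissible lines.
    bisector = math.radians((alpha_min + alpha_max) / 2.0)
    ux, uy = math.cos(bisector), math.sin(bisector)
    px, py = point[0] - apex[0], point[1] - apex[1]
    # Project the point onto the ray; only non-negative projections lie on it.
    t = max(px * ux + py * uy, 0.0)
    dist = math.hypot(px - t * ux, py - t * uy)
    return min(dist / eps, 1.0)
```

A point lying exactly on the bisector gets variability 0; points farther from it contribute up to 1 after the division by ε.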

Similarly as in the case of the dynamics of change, we find for a given value of variability obtained as above a best matching fuzzy set (linguistic label) using, e.g., some measure of distance or similarity, cf. the book by Cross and Sudkamp [9]. Again, the measure of variability is treated as a linguistic variable and expressed using linguistic terms (labels) modeled by fuzzy sets defined by the user.

4 Linguistic Data Summaries

A linguistic summary is meant as a (usually short) natural language like sentence (or some sentences) that subsumes the very essence of a set of data (cf. Kacprzyk and Zadrozny [18, 19]). This data set is numeric and usually large, not comprehensible in its original form by a human being. In Yager’s approach (cf. Yager [25], Kacprzyk and Yager [14], and Kacprzyk, Yager and Zadrozny [15]) the following perspective for linguistic data summaries is assumed:

• Y = {y1, . . . , yn} is a set of objects (records) in a database, e.g., the set of workers;

• A = {A1, . . . , Am} is a set of attributes characterizing objects from Y, e.g., salary, age, etc. in a database of workers, and Aj(yi) denotes the value of attribute Aj for object yi.

A linguistic summary of a data set consists of:

• a summarizer P, i.e. an attribute together with a linguistic value (fuzzy predicate) defined on the domain of attribute Aj (e.g. “low salary” for attribute “salary”);

• a quantity in agreement Q, i.e. a linguistic quantifier (e.g. most);


• truth (validity) T of the summary, i.e. a number from the interval [0, 1] assessing the truth (validity) of the summary (e.g. 0.7); usually, only summaries with a high value of T are interesting;

• optionally, a qualifier R, i.e. another attribute together with a linguistic value (fuzzy predicate) defined on the domain of attribute Ak, determining a (fuzzy) subset of Y (e.g. “young” for attribute “age”).

Thus, a linguistic summary may be exemplified by

T (most of employees earn low salary) = 0.7 (2)

or, in a richer (extended) form, including a qualifier (e.g. young), by

T (most of young employees earn low salary) = 0.9 (3)

Thus, basically, the core of a linguistic summary is a linguistically quantified proposition in the sense of Zadeh [26] which, for (2), may be written as

Qy’s are P (4)

and for (3), may be written as

Q Ry’s are P (5)

Then, T, i.e., the truth (validity) of a linguistic summary, directly corresponds to the truth value of (4) or (5). This may be calculated by using either Zadeh’s original calculus of linguistically quantified propositions (cf. [26]) or other interpretations of linguistic quantifiers.

5 Protoforms of Linguistic Trend Summaries

It was shown by Kacprzyk and Zadrozny [18] that Zadeh’s [27] concept of the protoform is convenient for dealing with linguistic summaries. This approach is also employed here.

Basically, a protoform is defined as a more or less abstract prototype (template) of a linguistically quantified proposition. Then, the summaries mentioned above might be represented by two types of protoforms:

• Summaries based on frequency:

– a protoform of a short form of linguistic summaries:

Q trends are P (6)


and exemplified by:

Most of trends have a large variability

– a protoform of an extended form of linguistic summaries:

Q R trends are P (7)

and exemplified by:

Most of slowly decreasing trends have a large variability

• Duration based summaries:

– a protoform of a short form of linguistic summaries:

The trends that took Q time are P (8)

and exemplified by:

The trends that took most time have a large variability

– a protoform of an extended form of linguistic summaries:

R trends that took Q time are P (9)

and exemplified by:

Slowly decreasing trends that took most time have a large variability

The truth values of the above types and forms of linguistic summaries will be found using the classic Zadeh [26] calculus of linguistically quantified propositions and the Sugeno [24] integral.

5.1 The Use of Zadeh’s Calculus of Linguistically Quantified Propositions

In Zadeh’s [26] fuzzy logic based calculus of linguistically quantified propositions, a (proportional, nondecreasing) linguistic quantifier Q is assumed to be a fuzzy set in the interval [0, 1] as, e.g.


μQ(x) =
  1          for x ≥ 0.8
  2x − 0.6   for 0.3 < x < 0.8
  0          for x ≤ 0.3          (10)

The truth values (from [0,1]) of (6) and (7) are calculated, respectively, as

T(Qy’s are P) = μQ( (1/n) ∑_{i=1}^{n} μP(yi) )   (11)

and

T(Q Ry’s are P) = μQ( ∑_{i=1}^{n} (μR(yi) ∧ μP(yi)) / ∑_{i=1}^{n} μR(yi) )   (12)

Here and later on we assume, for obvious reasons, that the respective denominators are not equal to zero. Otherwise we need to resort to slightly modified formulas, but this will not be considered in this paper.
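Formulas (10)–(12) translate directly into code. A minimal sketch, with the conjunction ∧ taken as the minimum:

```python
def mu_most(x):
    """Linguistic quantifier "most" as in (10)."""
    if x >= 0.8:
        return 1.0
    if x > 0.3:
        return 2.0 * x - 0.6
    return 0.0

def truth_simple(mu_P, ys, mu_Q=mu_most):
    """Truth of "Q y's are P", formula (11)."""
    return mu_Q(sum(mu_P(y) for y in ys) / len(ys))

def truth_extended(mu_R, mu_P, ys, mu_Q=mu_most):
    """Truth of "Q R y's are P", formula (12); assumes a nonzero denominator."""
    num = sum(min(mu_R(y), mu_P(y)) for y in ys)   # the conjunction as minimum
    den = sum(mu_R(y) for y in ys)
    return mu_Q(num / den)
```

Any proportional nondecreasing quantifier can be substituted for `mu_most`.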

The calculation of the truth values of duration based summaries is more complicated and requires a different approach. In the case of a summary “the trends that took Q time are P” we should compute the time taken by the trends that “are P”. This is obvious when a trend “is P” to degree 1, as we can then use the whole time taken by this trend. However, what to do if a trend “is P” only to some degree? We propose to take only a part of the time, defined by the degree to which the trend “is P”. In other words, we compute this time as μP(yi)·tyi, where tyi is the duration of trend yi. The value obtained (the duration of those trends that “are P”) is then normalized by dividing it by the overall time T. Finally, we may compute the degree to which the time taken by those trends that “are P” is Q.

We proceed in a similar way in the case of the extended form of linguistic summaries. Thus, we obtain the following formulae:

• for the short form of the duration based summaries (8):

T(y that took Q time are P) = μQ( (1/T) ∑_{i=1}^{n} μP(yi) tyi )   (13)

where T is the total time of the summarized trends and tyi is the duration of the i-th trend;

• for the extended form of the duration based summaries (9):

T(Ry that took Q time are P) = μQ( ∑_{i=1}^{n} (μR(yi) ∧ μP(yi)) tyi / ∑_{i=1}^{n} μR(yi) tyi )   (14)

where tyi is the duration of the i-th trend.
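Formulas (13) and (14) admit the same kind of sketch, with each trend's membership degree weighted by its duration. Representing a trend as a `(features, duration)` pair is an assumption made here for illustration:

```python
def mu_most(x):
    """Linguistic quantifier "most" as in (10)."""
    return 1.0 if x >= 0.8 else (2.0 * x - 0.6 if x > 0.3 else 0.0)

def truth_duration_simple(mu_P, trends, mu_Q=mu_most):
    """Truth of "trends that took Q time are P", formula (13).
    Each trend is a (features, duration) pair."""
    total = sum(t for _, t in trends)
    return mu_Q(sum(mu_P(y) * t for y, t in trends) / total)

def truth_duration_extended(mu_R, mu_P, trends, mu_Q=mu_most):
    """Truth of "R trends that took Q time are P", formula (14)."""
    num = sum(min(mu_R(y), mu_P(y)) * t for y, t in trends)
    den = sum(mu_R(y) * t for y, t in trends)
    return mu_Q(num / den)
```

With the qualifier identically equal to 1, the extended form reduces to the simple one, as expected.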


Both the fuzzy predicates P and R are assumed above to be of a rather simplified, atomic form referring to just one attribute. They can be extended to cover more sophisticated summaries involving some confluence of various, multiple attribute values as, e.g., “slowly decreasing and short”.

Alternatively, we may obtain the truth values of (8) and (9) if we divide each trend, which takes tyi time units, into tyi trends, each lasting one time unit. For this new set of trends we use frequency based summaries with the truth values defined in (11) and (12).

5.2 Use of the Sugeno Integral

The use of the Sugeno integral is particularly justified for the duration based linguistic summaries, for which the use of Zadeh’s method is not straightforward.

Let us briefly recall the idea of the Sugeno [24] integral. Let X = {x1, . . . , xn} be a finite set. Then, cf., e.g., Sugeno [24], a fuzzy measure on X is a set function μ : P(X) −→ [0, 1] such that:

μ(∅) = 0, μ(X) = 1
if A ⊆ B then μ(A) ≤ μ(B), ∀A, B ∈ P(X)   (15)

where P(X) denotes the set of all subsets of X.
Let μ(.) be a fuzzy measure on X. The discrete Sugeno integral of a function f : X −→ [0, 1], f(xi) = ai, with respect to μ(.) is a function Sμ : [0, 1]^n −→ [0, 1] such that

Sμ(a1, . . . , an) = max_{i=1,...,n} ( aσ(i) ∧ μ(Bi) )   (16)

where ∧ stands for the minimum, σ is such a permutation of {1, . . . , n} that aσ(i) is the i-th smallest element from among the ai’s, and Bi = {xσ(i), . . . , xσ(n)}.
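Definition (16) can be coded directly. Here the fuzzy measure is any user-supplied set function; ensuring that it satisfies (15) is the caller's responsibility:

```python
def sugeno_integral(values, measure):
    """Discrete Sugeno integral, formula (16).
    values: dict mapping element -> a_i in [0, 1]
    measure: fuzzy measure, a function from frozenset to [0, 1]."""
    xs = sorted(values, key=values.get)   # permutation sigma: ascending a_i
    best = 0.0
    for i, x in enumerate(xs):
        B_i = frozenset(xs[i:])           # elements whose a_j is >= a_sigma(i)
        best = max(best, min(values[x], measure(B_i)))
    return best
```

For example, with the cardinality-based measure μ(A) = |A|/|X| the integral acts as a fuzzy median of the ai's.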

We can view the function f as a membership function of a fuzzy set F ∈ F(X), where F(X) denotes the family of fuzzy sets defined in X. Then, the Sugeno integral can be equivalently defined as a function Sμ : F(X) −→ [0, 1] such that

Sμ(F) = max_{αi∈{a1,...,an}} ( αi ∧ μ(Fαi) )   (17)

where Fαi is the α-cut of F and the meaning of the other symbols is as in (16).

The fuzzy measure and the Sugeno integral may be intuitively interpreted in the context of multicriteria decision making (MCDM), where we have a set of criteria and some options (decisions) characterized by the degrees of satisfaction of particular criteria. In such a setting, X is a set of criteria and μ(.) expresses the importance of each subset of criteria, i.e., how the satisfaction of a given subset of criteria contributes to the overall evaluation of the option. Then, the properties (15) of the fuzzy measure properly require that the satisfaction of all criteria makes an option fully satisfactory and that the more criteria are satisfied by an option the better


its overall evaluation. Finally, the set F represents an option and μF(x) defines the degree to which it satisfies the criterion x. Then, the Sugeno integral may be interpreted as an aggregation operator yielding an overall evaluation of option F in terms of its satisfaction of the set of criteria X. In such a context the formula (17) may be interpreted as follows:

find a subset of criteria of the highest possible importance (expressed by μ) such that at the same time the minimal satisfaction degree of all these criteria by the option F is as high as possible (expressed by α), and take the minimum of these two degrees as the overall evaluation of the option F.   (18)

Now we will briefly show how the various linguistic summaries discussed in the previous section may be interpreted using the Sugeno integral. The monotone and nondecreasing linguistic quantifier Q is still defined, as in Zadeh’s calculus, as a fuzzy set in [0, 1], exemplified by (10).

The truth value of particular summaries is computed using the Sugeno integral (17). For simple types of summaries we are in a position to provide an interpretation similar to that given above for MCDM. For this purpose we will identify the set of criteria X with the set of (partial) trends, while the option F will be the whole time series under consideration, characterized in terms of how well its trends satisfy P.

In what follows |A| denotes the cardinality of set A, summarizers P and qualifiers R are identified with the fuzzy sets modelling the linguistic terms they contain, X is the set of all trends extracted from the time series, and time(xi) denotes the duration of the trend xi. Now, for the particular types of summaries we obtain:

• For simple frequency based summaries defined by (6)

The truth value may be expressed as Sμ(P) where

μ(Pα) = μQ( |Pα| / |X| )   (19)

Thus, referring to (18), the truth value is determined by looking for a subset of trends of a cardinality high enough, as required by the semantics of the quantifier Q, and such that all these trends “are P” to the highest possible degree.

• For extended frequency based summaries defined by (7)

The truth value of this type of summary may be expressed as Sμ(P) where

μ(Pα) = μQ( |(P ∩ R)α| / |Rα| )   (20)

• For simple duration based summaries defined with (8):

The truth value of this type of summary may be expressed as Sμ(P) where


μ(Pα) = μQ( ∑_{i: xi∈Pα} time(xi) / ∑_{i: xi∈X} time(xi) )   (21)

Thus, referring to (18), the truth value is determined by looking for a subset of trends such that their total duration, relative to the duration of the whole time series, is long enough, as required by the semantics of the quantifier Q, and such that all these trends “are P” to the highest possible degree.

• For extended duration based summaries defined with (9):

The truth of this type of summary may be expressed as Sμ(P) where

μ(Pα) = μQ( ∑_{i: xi∈(P∩R)α} time(xi) / ∑_{i: xi∈Rα} time(xi) )   (22)

Due to the properties of the quantifiers employed, it is obvious that all the μ(.)’s defined above for the particular types of summaries satisfy the axioms (15) of the fuzzy measure.
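As a sketch of the frequency based case, combining (17) with (19): the candidate α levels range over the membership degrees themselves, and |Pα| counts the trends whose membership in P is at least α.

```python
def mu_most(x):
    """Linguistic quantifier "most" as in (10)."""
    return 1.0 if x >= 0.8 else (2.0 * x - 0.6 if x > 0.3 else 0.0)

def sugeno_truth_frequency(mu_P, trends, mu_Q=mu_most):
    """Truth of "Q trends are P" via the Sugeno integral, (17) with (19)."""
    degrees = [mu_P(x) for x in trends]
    n = len(trends)
    best = 0.0
    for a in degrees:                               # candidate alpha levels
        cut = sum(1 for d in degrees if d >= a)     # |P_alpha|, the alpha-cut size
        best = max(best, min(a, mu_Q(cut / n)))
    return best
```

The duration based cases (21)–(22) follow the same pattern with durations summed over the α-cut instead of counts.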

This concludes our brief exposition of the use of the classic Zadeh calculus of linguistically quantified propositions and the Sugeno integral for linguistic quantifier driven aggregation of partial scores (trends). In particular, the former method is explicitly related to the computing with words and perceptions paradigm.

6 Example

Assume that from some given data we have extracted the trends listed in Table 1, using the method presented in this paper. We assume the granulation of the dynamics of change given in Sect. 3.1.

We will show the results obtained by using for the calculation of the truth (validity) values first the classic Zadeh calculus of linguistically quantified propositions, and then the Sugeno integral.

Table 1 An example of trends extracted

id   dynamics of change   duration       variability
     (α, in degrees)      (time units)   ([0,1])
1      25                 15             0.2
2     −45                  1             0.3
3      75                  2             0.8
4     −40                  1             0.1
5     −55                  1             0.7
6      50                  2             0.3
7     −52                  1             0.5
8     −37                  2             0.9
9      15                  5             0.0


6.1 Use of Zadeh’s Calculus of Linguistically Quantified Propositions

First, we consider the following frequency based summary:

Most of trends are decreasing (23)

In this summary most is the linguistic quantifier Q defined by (10). “Trends are decreasing” is the summarizer P, with decreasing given as:

μP(α) =
  0                  for α ≤ −65
  0.066α + 4.333     for −65 < α < −50
  1                  for −50 ≤ α ≤ −40
  −0.05α − 1         for −40 < α < −20
  0                  for α ≥ −20          (24)

The truth value of (23) is calculated via (11), and we obtain:

T(Most of the trends are decreasing) = μQ( (1/n) ∑_{i=1}^{n} μP(yi) ) = 0.389

where n is the number of all trends; here n = 9.
If we consider the extended form of the linguistic summary, then we may have,

for instance:

Most of short trends are decreasing (25)

Again, most is the linguistic quantifier Q given by (10). “Trends are decreasing” is the summarizer P as in the previous example. “Trend is short” is the qualifier R, with μR(t) defined as:

μR(t) =
  1               for t ≤ 1
  −(1/2)t + 3/2   for 1 < t < 3
  0               for t ≥ 3          (26)

The truth value of (25) is:

T(Most of short trends are decreasing) = μQ( ∑_{i=1}^{n} (μR(yi) ∧ μP(yi)) / ∑_{i=1}^{n} μR(yi) ) = 0.892


On the other hand, we may have the following duration based linguistic summary:

Trends that took most time are slowly increasing (27)

The linguistic quantifier Q is most, defined as previously. “Trends are slowly increasing” is the summarizer P, with μP(α) defined as:

μP(α) =
  0              for α ≤ 5
  0.1α − 0.5     for 5 < α < 15
  1              for 15 ≤ α ≤ 20
  −0.05α + 2     for 20 < α < 40
  0              for α ≥ 40          (28)

The truth value of (27) is then:

T(Trends that took most time are slowly increasing) = μQ( (1/T) ∑_{i=1}^{n} μP(yi) tyi ) = 0.48333

We may consider an extended form of a duration based summary, for instance:

Trends with low variability that took most time are slowly increasing   (29)

Again, most is the linguistic quantifier Q and “trends are slowly increasing” is the summarizer P, defined as in the previous example. “Trends have low variability” is the qualifier R, with μR(v) given as:

μR(v) =
  1          for v ≤ 0.2
  −5v + 2    for 0.2 < v < 0.4
  0          for v ≥ 0.4          (30)

The truth value of (29) is calculated via (14), and we obtain:

T(Trends with low variability that took most time are slowly increasing) = μQ( ∑_{i=1}^{n} (μR(yi) ∧ μP(yi)) tyi / ∑_{i=1}^{n} μR(yi) tyi ) = 0.8444


6.2 Use of the Sugeno Integral

For the following simple frequency based summary:

Most of trends are decreasing (31)

with most as in (10) and “trends are decreasing” with the membership function of decreasing given as in (24), the truth value of (31) is calculated using (17) and (19), which yield:

T(Most of the trends are decreasing) = max_{αi∈{a1,...,an}} ( αi ∧ μQ( |Pαi| / |X| ) ) = 0.511

If we assume the extended form, we may have the following summary:

Most of short trends are decreasing (32)

Again, most is the linguistic quantifier Q given by (10). “Trends are decreasing” is the summarizer P given as previously. “Trend is short” is the qualifier R defined by (26).

The truth value of (32) is calculated using (17) and (20), and we obtain:

T(Most of short trends are decreasing) = max_{αi∈{a1,...,an}} ( αi ∧ μQ( |(P ∩ R)αi| / |Rαi| ) ) = 0.9

On the other hand, we may have the following simple duration based linguistic summary:

Trends that took most time are slowly increasing (33)

where “trends are slowly increasing” is the summarizer P, with μP(α) defined as in (28), and the linguistic quantifier most is defined as previously.

The truth value of (33) is:

T(Trends that took most time are slowly increasing) = max_{αi∈{a1,...,an}} ( αi ∧ μQ( ∑_{i: xi∈Pαi} time(xi) / ∑_{i: xi∈X} time(xi) ) ) = 0.733

Finally, we may consider an extended form of a duration based summary, here exemplified by:

Trends with a low variability that took most of the time are slowly increasing   (34)


Again, most is the linguistic quantifier and “trends are slowly increasing” is the summarizer P defined as previously. “Trends have a low variability” is the qualifier R, with μR(v) given in (30).

The truth value of (34) is:

T(Trends with low variability that took most of the time are slowly increasing) = max_{αi∈{a1,...,an}} ( αi ∧ μQ( ∑_{i: xi∈(P∩R)αi} time(xi) / ∑_{i: xi∈Rαi} time(xi) ) ) = 0.75

7 Concluding Remarks

In the paper we have proposed an effective and efficient method for the linguistic summarization of time series. We use a modification of the Sklansky and Gonzalez [23] algorithm to extract from a given time series trends that are described by 3 basic attributes related to their dynamics of change, duration and variability. The linguistic summarization employed is in the sense of Yager. For the aggregation of partial trends we used the classic Zadeh calculus of linguistically quantified propositions and the Sugeno integral.

The approach proposed may be a further step towards a more human consistent, natural language based temporal data analysis. This may be viewed as reflecting the philosophy of Zadeh’s computing with words and perceptions.

References

1. I. Batyrshin (2002). On Granular Derivatives and the Solution of a Granular Initial Value Problem. International Journal of Applied Mathematics and Computer Science, 12, 403–410.

2. I. Batyrshin, J. Kacprzyk, L. Sheremetov, L.A. Zadeh, Eds. (2006). Perception-based Data Mining and Decision Making in Economics and Finance. Springer, Heidelberg and New York.

3. I. Batyrshin, R. Herrera-Avelar, L. Sheremetov, A. Panova (2006). Moving Approximation Transform and Local Trend Association on Time Series Data Bases. In: Batyrshin, I.; Kacprzyk, J.; Sheremetov, L.; Zadeh, L.A. (Eds.): Perception-based Data Mining and Decision Making in Economics and Finance. Springer, Heidelberg and New York, pp. 53–79.

4. I. Batyrshin, L. Sheremetov (2006). Perception Based Functions in Qualitative Forecasting. In: Batyrshin, I.; Kacprzyk, J.; Sheremetov, L.; Zadeh, L.A. (Eds.): Perception-based Data Mining and Decision Making in Economics and Finance. Springer, Heidelberg and New York, pp. 112–127.

5. I. Batyrshin, L. Sheremetov, R. Herrera-Avelar (2006). Perception Based Patterns in Time Series Data Mining. In: Batyrshin, I.; Kacprzyk, J.; Sheremetov, L.; Zadeh, L.A. (Eds.): Perception-based Data Mining and Decision Making in Economics and Finance. Springer, Heidelberg and New York, pp. 80–111.

6. T. Calvo, G. Mayor, R. Mesiar, Eds. (2002). Aggregation Operators. New Trends and Applications. Springer-Verlag, Heidelberg and New York.


7. D.-A. Chiang, L.R. Chow, Y.-F. Wang (2000). Mining time series data by a fuzzy linguistic summary system. Fuzzy Sets and Systems, 112, 419–432.

8. J. Colomer, J. Melendez, J.L. de la Rosa, J. Aguilar (1997). A qualitative/quantitative representation of signals for supervision of continuous systems. In: Proceedings of the European Control Conference – ECC97, Brussels.

9. V. Cross, Th. Sudkamp (2002). Similarity and Compatibility in Fuzzy Set Theory: Assessment and Applications. Springer-Verlag, Heidelberg and New York.

10. H. Feidas, T. Makrogiannis, E. Bora-Senta (2004). Trend analysis of air temperature time series in Greece and their relationship with circulation using surface and satellite data: 1955–2001. Theoretical and Applied Climatology, 79, 185–208.

11. I. Glöckner (2006). Fuzzy Quantifiers. A Computational Theory. Springer, Heidelberg and New York.

12. F. Hoeppner (2003). Knowledge Discovery from Sequential Data. Ph.D. dissertation, Braunschweig University.

13. B.G. Hunt (2001). A description of persistent climatic anomalies in a 1000-year climatic model simulation. Climate Dynamics, 17, 717–733.

14. J. Kacprzyk and R.R. Yager (2001). Linguistic summaries of data using fuzzy logic. International Journal of General Systems, 30, 133–154.

15. J. Kacprzyk, R.R. Yager and S. Zadrozny (2000). A fuzzy logic based approach to linguistic summaries of databases. International Journal of Applied Mathematics and Computer Science, 10, 813–834.

16. J. Kacprzyk and S. Zadrozny (1995). FQUERY for Access: fuzzy querying for a Windows-based DBMS. In: P. Bosc and J. Kacprzyk (Eds.), Fuzziness in Database Management Systems, Springer-Verlag, Heidelberg and New York, pp. 415–433.

17. J. Kacprzyk and S. Zadrozny (1999). The paradigm of computing with words in intelligent database querying. In: L.A. Zadeh and J. Kacprzyk (Eds.), Computing with Words in Information/Intelligent Systems. Part 2. Foundations, Springer-Verlag, Heidelberg and New York, pp. 382–398.

18. J. Kacprzyk, S. Zadrozny (2005). Linguistic database summaries and their protoforms: toward natural language based knowledge discovery tools. Information Sciences, 173, 281–304.

19. J. Kacprzyk, S. Zadrozny (2005). Fuzzy linguistic data summaries as a human consistent, user adaptable solution to data mining. In: B. Gabrys, K. Leiviska, J. Strackeljan (Eds.), Do Smart Adaptive Systems Exist? Springer, Berlin, Heidelberg and New York, pp. 321–339.

20. S. Kivikunnas (1999). Overview of Process Trend Analysis Methods and Applications. In: Proceedings of the Workshop on Applications in Chemical and Biochemical Industry, Aachen.

21. K.B. Konstantinov, T. Yoshida (1992). Real-time qualitative analysis of the temporal shapes of (bio)process variables. American Institute of Chemical Engineers Journal, 38, 1703–1715.

22. K. Neusser (1999). An investigation into a non-linear stochastic trend model. Empirical Economics, 24, 135–153.

23. J. Sklansky and V. Gonzalez (1980). Fast polygonal approximation of digitized curves. Pattern Recognition, 12, 327–331.

24. M. Sugeno (1974). Theory of Fuzzy Integrals and Its Applications. Ph.D. Thesis, Tokyo Institute of Technology, Tokyo, Japan.

25. R.R. Yager (1982). A new approach to the summarization of data. Information Sciences, 28, 69–86.

26. L.A. Zadeh (1983). A computational approach to fuzzy quantifiers in natural languages. Computers and Mathematics with Applications, 9, 149–184.

27. L.A. Zadeh (2002). A prototype-centered approach to adding deduction capabilities to search engines – the concept of a protoform. BISC Seminar, University of California, Berkeley.


Evolution of Fuzzy Logic: From Intelligent Systems and Computation to Human Mind

Masoud Nikravesh

Abstract Inspired by the human’s remarkable capability to perform a wide variety of physical and mental tasks without any measurements and computations, and dissatisfied with classical logic as a tool for modeling human reasoning in an imprecise environment, Lotfi A. Zadeh developed the theory and foundation of fuzzy logic with his 1965 paper “Fuzzy Sets” [1] and extended his work with his 2005 paper “Toward a Generalized Theory of Uncertainty (GTU)—An Outline” [2]. Fuzzy logic has at least two main sources over the past century. The first of these sources was initiated by Peirce in the form of what he called a logic of vagueness in the 1900s, and the second source is Lotfi A. Zadeh’s work on fuzzy sets and fuzzy logic in the 1960s and 1970s.

Keywords: Nature of Mind · Zadeh · Fuzzy Sets · Logic · Vagueness

1 Introduction

Humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements and any computations. In the traditional sense, computation means manipulation of numbers, whereas humans use words for computation and reasoning. Underlying this capability is the brain’s crucial ability to manipulate perceptions and its remarkable capability to operate on, and reason with, perception-based information which is intrinsically vague, imprecise and fuzzy. In this perspective, the addition of the machinery of computing with words to existing theories may eventually lead to theories which have a superior capability to deal with real-world problems and make it possible to conceive and design intelligent systems or systems of systems with a much higher MIQ (Machine IQ) than those we have today. The role model for an intelligent system is the human mind. Dissatisfied with

Masoud Nikravesh
BISC Program, Computer Sciences Division, EECS Department, University of California, Berkeley, and NERSC, Lawrence Berkeley National Lab., California 94720–1776, USA
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


classical logic as a tool for modeling human reasoning in an imprecise and vague environment, Lotfi A. Zadeh developed the formalization of fuzzy logic, starting with his 1965 paper “Fuzzy Sets” [1], and extended his work with his 2005 paper “Toward a Generalized Theory of Uncertainty (GTU)—An Outline” [2].

To understand how the human mind works and its relationship with intelligent systems, fuzzy logic and Zadeh’s contribution, in this paper we will focus on the issues related to common sense, the historical sources of fuzzy logic, different historical paths with respect to the nature of the human mind, and the evolution of intelligent systems and fuzzy logic.

2 Common Sense

In this section, we will focus on issues related to common sense as the basis of the nature of the human mind. We will also give a series of examples in this respect. Piero Scaruffi, in his book “Thinking about Thought” [3], argues that humans have a remarkable capability to confront real problems in real situations by using a very primitive logic. In Lotfi A. Zadeh’s perspective, humans have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements and any computations. This is the notion that we usually refer to as “common sense”. This remarkable capability resembles less a mathematical genius than a faculty quite effective for the purpose of human survival in this world. Common sense usually determines what we do, regardless of what we think. However, it is very important to mention that common sense is sometimes wrong. There are many examples of “paradoxical” common sense reasoning in the history of science. For example, Zeno “proved” that Achilles could never overtake a turtle, one can easily “prove” that General Relativity is absurd (a twin that gets younger just by traveling very far is certainly a paradox for common sense), and, not to mention, common sense told us that the Earth is flat and at the center of the world. However, physics is based on precise mathematics and not on common sense, for the reason that common sense can often be wrong.

Now the question is: “What is this very primitive logic that humans use for common sense based on?” Logic is based on deduction, a method of exact inference based on classical logic (i.e. “all Greeks are human and Socrates is a Greek, therefore Socrates is a human”) or approximate inference in the form of fuzzy logic. There are other types of inference, such as induction, which infers generalizations from a set of events (i.e. “Water boils at 100 degrees”), and abduction, which infers plausible causes of an effect (i.e. “You have the symptoms of a flu”).

While common sense is based on a set of primitive logics, humans rarely employ classical logic to determine how to act in a new situation. If we used classical logic, which is surely often paradoxical, one might be able to take only a few actions a day. Classical logic allows humans to reach a conclusion when a problem is well defined and formulated, while common sense helps humans to deal with the complexity of the real world, which is ill defined, vague and not formulated. In short, common sense provides an effective path to make critical


decisions very quickly based on perceptive information. The bases of common sense are both reasoning methods and human knowledge, but these are quite distinct from classical logic. It is extremely difficult, if not impossible, to build a mathematical model for some of the simplest decisions we make daily. For example, humans can use common sense to draw conclusions when we have incomplete, imprecise, and unreliable or perceptive information such as “many”, “tall”, “red”, “almost”. Furthermore, and most importantly, common sense does not have to deal with logical paradoxes. What is important to mention is that classical logic is inadequate for ordinary life. The limits and inadequacies of classical logic have been known for decades, and numerous alternatives or improvements have been proposed, including the use of intuitionism, non-monotonic logic (second thoughts), plausible reasoning (quick, efficient response to problems when an exact solution is not necessary), and fuzzy logic.

Another important aspect of common sense reasoning is the remarkable humancapability on dealing with uncertainties which are either implicit or explicit. Themost common classic tool for representing uncertainties is probability theory whichis formulated by Thomas Bayes in the late 18th century. The basis for the proba-bilities theory is to translate uncertainty into some form of statistics. In addition,other techniques such as Bayes’ theorem allow one to draw conclusions from anumber of probable events. In a technical framework, we usually use probabilityto express a belief. In the 1960s Leonard Savage [4] thought of the probability ofan event as not merely the frequency with which that event occurs, but also as ameasure of the degree to which someone believes it will happen. While Bayes rulewould be very useful to build generalizations, unfortunately, it requires the “prior”probability, which, in the case of induction, is precisely what we are trying to assess.Also, Bayes’ theorem often time does not yield intuitive conclusions such as theaccumulation of evidence tends to lower the probability, not to increase it or thesum of the probabilities of all possible events must be one, and that is also not veryintuitive. To solve some of the limitations of the probability theory and make itmore plausible, in 1968 Glenn Shafer [5] and Stuart Dempster devised a theoryof evidence by introducing a “belief function” which operates on all subsets ofevents (not just the single events). Dempster-Shafer’s theory allows one to assigna probability to a group of events, even if the probability of each single event isnot known. Indirectly, Dempster-Shafer’s theory also allows one to represent “ig-norance”, as the state in which the belief of an event is not known. In other words,Dempster-Shafer’s theory does not require a complete probabilistic model of thedomain. 
Zadeh introduced fuzzy sets in 1965 and fuzzy logic in 1973, generalizing classical bivalent logic to deal with common-sense problems.

3 Historical Sources of Fuzzy Logic

Fuzzy logic has at least two main sources over the past century. The first of these sources was initiated by Charles S. Peirce, who used the term "logic of vagueness" but could not finish the idea and fully develop the theory prior to his death. In 1908,

Page 51: Forging New Frontiers: Fuzzy Pioneers I

40 M. Nikravesh

he was able to outline and envision the theory of triadic logic. The concept of "vagueness" was later picked up by Max Black [6]. In 1923, the philosopher Bertrand Russell [7], in a paper on vagueness, suggested that language is invariably vague and that vagueness is a matter of degree. More recently, the logic of vagueness became the focus of studies by others, such as Brock [8], Nadin [9, 10], Engel-Tiercelin [11], and Merrell [12, 13, 14, 15].

The second source is the one initiated by Lotfi A. Zadeh, who used the term "fuzzy sets" for the first time in the early 1960s and has extended the idea over the past 40 years.

3.1 Lotfi A. Zadeh [1]

Prior to the publication of his [Prof. Zadeh's] first paper on fuzzy sets in 1965—a paper which was written while he was serving as Chair of the Department of Electrical Engineering at UC Berkeley—he had achieved both national and international recognition as the initiator of system theory, and as a leading contributor to what is more broadly referred to as systems analysis. His principal contributions were the development of a novel theory of time-varying networks; a major extension of Wiener's theory of prediction (with J. R. Ragazzini) [16]; the z-transform method for the analysis of sampled-data systems (with J. R. Ragazzini) [17]; and, most importantly, the development of the state space theory of linear systems, published as a co-authored book with C. A. Desoer in 1963 [18].

Publication of his first paper on fuzzy sets in 1965 [1] marked the beginning of a new phase of his scientific career. From 1965 on, almost all of his publications have been focused on the development of fuzzy set theory, fuzzy logic and their applications. His first paper, entitled "Fuzzy Sets," got a mixed reaction. His strongest supporter was the late Professor Richard Bellman, an eminent mathematician and a leading contributor to systems analysis and control. For the most part, he encountered skepticism and sometimes outright hostility. There were two principal reasons: the word "fuzzy" is usually used in a pejorative sense; and, more importantly, his abandonment of classical, Aristotelian, bivalent logic was too radical a departure from deep-seated scientific traditions. Fuzzy logic goes beyond Lukasiewicz's [19] multi-valued logic because it allows for an infinite number of truth values: the degree of "membership" can assume any value between zero and one. Fuzzy logic is also consistent with the principle of incompatibility stated at the beginning of the century by a father of modern thermodynamics, Pierre Duhem.
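The idea that membership can assume any value between zero and one can be made concrete with a toy membership function. The breakpoints (160 cm and 190 cm) are arbitrary values chosen for this illustration, not anything from the text:

```python
def tall(height_cm: float) -> float:
    """Degree of membership in the fuzzy set 'tall': 0 below 160 cm,
    1 above 190 cm, rising linearly in between (illustrative values)."""
    if height_cm <= 160:
        return 0.0
    if height_cm >= 190:
        return 1.0
    return (height_cm - 160) / 30.0

print(tall(150))  # 0.0  -- definitely not tall
print(tall(175))  # 0.5  -- tall "to a degree"
print(tall(195))  # 1.0  -- definitely tall
```

A bivalent predicate would be forced to return only 0 or 1; here 175 cm is tall to degree 0.5, which is precisely the departure from Aristotelian logic described above.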

The second phase, 1973–1999 [22, 23, 24, 25, 26], began with the publication of the 1973 paper, "Outline of a New Approach to the Analysis of Complex Systems and Decision Processes" [20]. Two key concepts were introduced in this paper: (a) the concept of a linguistic variable; and (b) the concept of a fuzzy if-then rule. Today, almost all applications of fuzzy set theory and fuzzy logic involve the use of these concepts. What should be noted is that Zadeh employed the term "fuzzy logic" for the first time in his 1975 Synthese paper "Fuzzy Logic and Approximate Reasoning" [21]. Today, "fuzzy logic" is used in two different senses: (a) a narrow sense, in which fuzzy logic, abbreviated as FLn, is a logical system which is a generalization
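To make the two concepts tangible, here is a minimal sketch of a linguistic variable ("temperature", with the terms cold and hot) driving two fuzzy if-then rules. The membership breakpoints and the simple weighted-average combination of rule outputs are my own illustrative choices, not the specific calculus of the 1973 paper:

```python
def cold(t):
    """Membership of temperature t (deg C) in 'cold': 1 below 10, 0 above 20."""
    return max(0.0, min(1.0, (20.0 - t) / 10.0))

def hot(t):
    """Membership of temperature t (deg C) in 'hot': 0 below 25, 1 above 35."""
    return max(0.0, min(1.0, (t - 25.0) / 10.0))

def fan_speed(t):
    """Two fuzzy if-then rules:
         IF temperature IS cold THEN speed = 0
         IF temperature IS hot  THEN speed = 100
    Rule strengths are the membership degrees; the crisp output is a
    firing-strength-weighted average of the rule consequents."""
    w_cold, w_hot = cold(t), hot(t)
    if w_cold + w_hot == 0.0:
        return 50.0  # neither rule fires: fall back to a neutral speed
    return (w_cold * 0.0 + w_hot * 100.0) / (w_cold + w_hot)

print(fan_speed(5))   # 0.0   -- only the 'cold' rule fires
print(fan_speed(40))  # 100.0 -- only the 'hot' rule fires
```

The point is that the rules are stated in words over a linguistic variable, yet they compute a numeric control output.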

Page 52: Forging New Frontiers: Fuzzy Pioneers I

Evolution of Fuzzy Logic: From Intelligent Systems and Computation to Human Mind 41

of multivalued logic; and (b) a wide sense, in which fuzzy logic, abbreviated as FL, is a union of FLn, fuzzy set theory, possibility theory, the calculus of fuzzy if-then rules, fuzzy arithmetic, the calculus of fuzzy quantifiers, and related concepts and calculi. The distinguishing characteristic of FL is that in FL everything is, or is allowed to be, a matter of degree.
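"Everything is a matter of degree" has a precise operational reading in fuzzy set theory. The sketch below uses the standard min/max/complement operations on membership degrees; the sample degrees 0.7 and 0.4 are arbitrary values for illustration:

```python
# Standard fuzzy set operations on membership degrees in [0, 1]:
# intersection = min, union = max, complement = 1 - x.
def f_and(a, b): return min(a, b)
def f_or(a, b):  return max(a, b)
def f_not(a):    return 1.0 - a

mu_A, mu_B = 0.7, 0.4  # membership degrees of one element in sets A and B
print(f_and(mu_A, mu_B))                   # 0.4
print(f_or(mu_A, mu_B))                    # 0.7
print(round(f_and(mu_A, f_not(mu_A)), 2))  # 0.3 -- "A and not A" need not be 0
```

The last line is the matter-of-degree point: the law of non-contradiction holds only partially, which is what separates FL from bivalent logic.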

Many of Prof. Zadeh's papers written in the eighties and early nineties were concerned, for the most part, with applications of fuzzy logic to knowledge representation and commonsense reasoning. Unlike probability theory, fuzzy logic represents the real world without any need to assume the existence of randomness.

Soft computing came into existence in 1991, with the launching of BISC (Berkeley Initiative in Soft Computing) at UC Berkeley. Basically, soft computing is a coalition of methodologies which collectively provide a foundation for the conception, design and utilization of intelligent systems. The principal members of the coalition are: fuzzy logic, neurocomputing, evolutionary computing, probabilistic computing, chaotic computing, rough set theory and machine learning. The basic tenet of soft computing is that, in general, better results can be obtained through the use of the constituent methodologies of soft computing in combination rather than in a stand-alone mode. A combination which has attained wide visibility and importance is that of neuro-fuzzy systems. Other combinations, e.g., neuro-fuzzy-genetic systems, are appearing, and the impact of soft computing is growing on both the theoretical and applied levels.

3.2 Charles S. Peirce [27]

Charles S. Peirce is known for what he termed a logic of vagueness and, equivalently, a logic of possibility and a logic of continuity. Peirce believed that this logic would fit a theory of possibility that could be used for most cases of reasoning. Peirce's view was indeed a radical one in his time, which obviously went against the classical logic of Boole, De Morgan, Whately, Schröder, and others. For a period of time, researchers believed that Peirce's work was simply in line with triadic logic; however, it must be more than that. In fact, one should assume, as Peirce himself often used the term logic of vagueness, that his theory and thinking are most likely one of the sources of today's fuzzy logic. Unfortunately, Peirce was not able to fully develop the idea and theory of the logic of vagueness that he envisioned. In 1908, he was able to outline the makings of a 'triadic logic' and envision his logic based on actuality, real possibility, and necessity.

4 Human Mind

Mind is what distinguishes humans from the rest of creation. The human mind is the next unexplored frontier of science and arguably the most important for humankind. Our mind is who we are. We consider the human brain as a machine or as a powerful


Fig. 1 New Age

computer; the mind is a set of processes. The mind therefore occurs in the brain; however, the mind is not like a computer program.

4.1 Different Historical Paths

There have been different historical paths toward the understanding of the human mind; these include philosophy, psychology, mathematics, biology (neurocomputing), computer science and AI (McCarthy, 1950s), linguistics and computational linguistics (Chomsky, 1960s), and physics (since the 1980s) (Figs. 1–4). Each of these fields has contributed in a different way to help scientists better understand the human mind. For example, the contributions of information science include: 1) considering the mind as a symbol processor, 2) the formal study of human knowledge, 3) knowledge processing, 4) the study of common-sense knowledge, and 5) neurocomputing. The contributions of linguistics include competence over performance, pragmatics, and metaphor. The contributions of psychology include understanding the mind as a processor of concepts, reconstructive memory, memory as learning and reasoning, and the fundamental unity of cognition. The contributions of neurophysiology include understanding the brain as an evolutionary system, the mind as shaped mainly by genes and experience, neural-level competition, and connectionism. Finally, the contributions of physics include understanding how living beings create order from disorder, non-equilibrium thermodynamics, self-organizing systems, the mind as a self-organizing system, and theories of consciousness based on quantum and relativity physics.


MACHINE INTELLIGENCE: History

1936: TURING MACHINE
1940: VON NEUMANN'S DISTINCTION BETWEEN DATA AND INSTRUCTIONS
1943: FIRST COMPUTER
1943: MCCULLOCH & PITTS NEURON
1947: VON NEUMANN'S SELF-REPRODUCING AUTOMATA
1948: WIENER'S CYBERNETICS
1950: TURING'S TEST
1956: DARTMOUTH CONFERENCE ON ARTIFICIAL INTELLIGENCE
1957: NEWELL & SIMON'S GENERAL PROBLEM SOLVER
1957: ROSENBLATT'S PERCEPTRON
1957: CHOMSKY'S GRAMMAR
1958: SELFRIDGE'S PANDEMONIUM
1959: SAMUEL'S CHECKERS
1960: PUTNAM'S COMPUTATIONAL FUNCTIONALISM
1960: WIDROW'S ADALINE
1965: FEIGENBAUM'S DENDRAL
1965: ZADEH'S FUZZY LOGIC
1966: WEIZENBAUM'S ELIZA
1967: HAYES-ROTH'S HEARSAY
1967: FILLMORE'S CASE FRAME GRAMMAR
1969: MINSKY & PAPERT'S PAPER ON NEURAL NETWORKS
1970: WOODS' ATN
1972: BUCHANAN'S MYCIN
1972: WINOGRAD'S SHRDLU
1974: MINSKY'S FRAME
1975: SCHANK'S SCRIPT
1975: HOLLAND'S GENETIC ALGORITHMS
1979: CLANCEY'S GUIDON
1980: SEARLE'S CHINESE ROOM ARTICLE
1980: MCDERMOTT'S XCON
1982: HOPFIELD'S NEURAL NET
1986: RUMELHART & MCCLELLAND'S PDP
1990: ZADEH'S SOFT COMPUTING
2000: ZADEH'S COMPUTING WITH WORDS AND PERCEPTIONS & PNL

Fig. 2 History of Machine Intelligence

4.2 Mind and Computation

As we discussed earlier, the mind is not like a computer program. The mind is a set of processes that occur in the brain. The mind has computational equivalences in many dimensions, such as imaging and thinking. Other important dimensions in which the mind has computational equivalences include appreciation of beauty, attention, awareness, belief, cognition, consciousness, emotion, feeling, imagination, intelligence, knowledge, meaning, perception, planning, reason, sense of good and bad, sense of justice and duty, thought, understanding, and wisdom. Dr. James S. Albus, Senior NIST Fellow in the Intelligent Systems Division of the Manufacturing Engineering Laboratory at the National Institute of Standards and Technology, provided a great overview of these dimensions and the computational equivalences of the mind, as described below (with his permission):

• Understanding = correspondence between model and reality
• Belief = level of confidence assigned to knowledge
• Wisdom = ability to make decisions that achieve long term goals
• Attention = focusing sensors & perception on what is important
• Awareness = internal representation of external world


DAVID HILBERT (1928)
- MATHEMATICS = BLIND MANIPULATION OF SYMBOLS
- FORMAL SYSTEM = A SET OF AXIOMS AND A SET OF INFERENCE RULES
- PROPOSITIONS AND PREDICATES
- DEDUCTION = EXACT REASONING

KURT GOEDEL (1931)
- A CONCEPT OF TRUTH CANNOT BE DEFINED WITHIN A FORMAL SYSTEM

ALFRED TARSKI (1935)
- DEFINITION OF "TRUTH": A STATEMENT IS TRUE IF IT CORRESPONDS TO REALITY ("CORRESPONDENCE THEORY OF TRUTH")
- BUILD MODELS OF THE WORLD WHICH YIELD INTERPRETATIONS OF SENTENCES IN THAT WORLD
- TRUTH CAN ONLY BE RELATIVE TO SOMETHING: "META-THEORY"

ALAN TURING (1936)
- COMPUTATION = THE FORMAL MANIPULATION OF SYMBOLS THROUGH THE APPLICATION OF FORMAL RULES
- HILBERT'S PROGRAM REDUCED TO MANIPULATION OF SYMBOLS
- LOGIC = SYMBOL PROCESSING
- EACH PREDICATE IS DEFINED BY A FUNCTION, EACH FUNCTION IS DEFINED BY AN ALGORITHM

NORBERT WIENER (1947): CYBERNETICS
- BRIDGE BETWEEN MACHINES AND NATURE, BETWEEN "ARTIFICIAL" SYSTEMS AND NATURAL SYSTEMS
- FEEDBACK, HOMEOSTASIS, MESSAGE, NOISE, INFORMATION
- PARADIGM SHIFT FROM THE WORLD OF CONTINUOUS LAWS TO THE WORLD OF ALGORITHMS, DIGITAL VS ANALOG WORLD

CLAUDE SHANNON AND WARREN WEAVER (1949): INFORMATION THEORY
- ENTROPY = A MEASURE OF DISORDER = A MEASURE OF THE LACK OF INFORMATION
- LEON BRILLOUIN'S NEGENTROPY PRINCIPLE OF INFORMATION

ANDREI KOLMOGOROV (1960): ALGORITHMIC INFORMATION THEORY
- COMPLEXITY = QUANTITY OF INFORMATION
- CAPACITY OF THE HUMAN BRAIN = 10 TO THE 15TH POWER
- MAXIMUM AMOUNT OF INFORMATION STORED IN A HUMAN BEING = 10 TO THE 45TH
- ENTROPY OF A HUMAN BEING = 10 TO THE 23RD

LOTFI A. ZADEH (1965): FUZZY SETS
"Stated informally, the essence of this principle is that as the complexity of a system increases, our ability to make precise and yet significant statements about its behavior diminishes until a threshold is reached beyond which precision and significance (or relevance) become almost mutually exclusive characteristics."

Fig. 3 History of Computation Science

• Consciousness = aware of self & one's relationship to world
• Intelligence = ability to achieve goals despite uncertainty
• Sense of good and bad = fundamental value judgment
• Sense of justice & duty = culturally derived value judgment


First Attempts
- Simple neurons which are binary devices with fixed thresholds – simple logic functions like "and", "or" – McCulloch and Pitts (1943)

Promising & Emerging Technology
- Perceptron – a three-layer network which can learn to connect or associate a given input to a random output – Rosenblatt (1958)
- ADALINE (ADAptive LInear Element) – an analogue electronic device which uses the least-mean-squares (LMS) learning rule – Widrow & Hoff (1960)
- 1962: Rosenblatt proved the convergence of the perceptron training rule

Period of Frustration & Disrepute
- Minsky & Papert's book in 1969, in which they generalised the limitations of single-layer perceptrons to multilayered systems: "...our intuitive judgment that the extension (to multilayer systems) is sterile"
- 1970–1985: Very little research on neural nets

Innovation
- Grossberg's (Stephen Grossberg and Gail Carpenter, 1988) ART (Adaptive Resonance Theory) networks based on biologically plausible models
- Anderson and Kohonen developed associative techniques
- Klopf (A. Harry Klopf), in 1972, developed a basis for learning in artificial neurons based on a biological principle for neuronal learning called heterostasis
- Werbos (Paul Werbos, 1974) developed and used the back-propagation learning method
- Fukushima's (Kunihiko Fukushima) cognitron (a stepwise-trained multilayered neural network for interpretation of handwritten characters)

Re-Emergence
- 1983: Defense Advanced Research Projects Agency (DARPA) funded neurocomputing research
- 1986: Invention of backpropagation [Rumelhart and McClelland, but also Parker and, earlier, Werbos], which can learn from nonlinearly-separable data sets
- Since 1985: A lot of research in neural nets!

Fig. 4 History of Machine Intelligence (Neuro Computing)

• Appreciation of beauty = perceptual value judgment
• Imagination = modeling, simulation, & visualization
• Thought = analysis of what is imagined
• Reason = logic applied to thinking
• Planning = thinking about possible future actions and goals
• Emotion = value judgment, evaluation of good and bad
• Feeling = experience of sensory input or emotional state
• Perception = transformation of sensation into knowledge
• Knowledge = information organized so as to be useful
• Cognition = analysis, evaluation, and use of knowledge
• Meaning = relationships between forms of knowledge

4.3 The Nature of Mind

As we discussed earlier, there have been different historical paths toward the understanding of the human mind. This understanding provides a better insight into the nature of the mind. The main contributions of the information sciences toward a better understanding of the nature of the mind include the development of neural network models, modeling the mind as a symbol processor, the formal study of human knowledge and knowledge processing, and common-sense knowledge. The main contributions of linguistics include the understanding of pragmatics, metaphor, and competence. Psychology has opened an important door toward better understanding the nature of


the mind, through interpreting the mind as a processor of concepts, the unity of cognition, memory as learning and reasoning, and reconstructive memory. Neurophysiology has helped us to better understand the connection between the brain and the mind, interpreting the brain as an evolutionary system, establishing that the mind is shaped and formed mainly by genes and experience, and offering better insight into the nature of the mind through neural-level competition and connectionism. Physics has been in the front row of understanding the nature of the mind and of developing systems to represent such unexplored frontiers. Physics provides further insights into the nature of the mind, such as the mind as a self-organizing system and theories of consciousness based on quantum and relativity physics.

4.4 Recent Breakthroughs

In recent years we have witnessed a series of breakthroughs toward a better understanding of the human mind. These include breakthroughs in neurosciences, cognitive modeling, intelligent control, depth imaging, and computational power.

Dr. James S. Albus, Senior NIST Fellow, summarized some of the new breakthroughs as follows (with his permission):

Neurosciences – Focused on understanding the brain

- chemistry, synaptic transmission, axonal connectivity, functional MRI

Cognitive Modeling – Focused on representation and use of knowledge in performing cognitive tasks

- mathematics, logic, language

Intelligent Control – Focused on making machines behave appropriately (i.e., achieve goals) in an uncertain environment

- manufacturing, autonomous vehicles, agriculture, mining

Depth Imaging – Enables geometrical modeling of the 3-D world. Facilitates grouping and segmentation.

- Provides solution to symbol-grounding problem.

Computational Power – Enables processes that rival the brain in operations per second. At 10^10 ops, heading for 10^15 ops.

4.5 Future Anticipations: Reality or SciFi

It is the year 2020. What can we expect in the year 2020? Scientists project the following possible predictions and anticipations:

• Computing Power ==> Quadrillion/sec/$100

– 5–15 Quadrillion/sec (IBM’s Fastest computer = 100 Trillion)


• High Resolution Imaging (Brain and Neuroscience)

– Human Brain, Reverse Engineering
– Dynamic Neuron Level Imaging/Scanning and Visualization

• Searching, Logical Analysis, Reasoning

– Searching for Next Google

• Technology goes Nano and Molecular Level

– Nanotechnology
– Nano Wireless Devices and OS

• Tiny blood-cell-size robots
• Virtual Reality through controlling Brain Cell Signals

• Once we develop a very intelligent robot, or perhaps a humanoid, that can do everything a human can do, and perhaps perform most of the tasks in the corporate and labor world better, faster and more efficiently, then who do you think should work, and who should get paid?

Every day, the boundary between reality and fantasy, and between science and fiction, gets fuzzier and fuzzier, and it is hard to know whether we are dreaming or living in reality. Humans have dreamed for years and fantasized about their dreams, and today, and perhaps in the year 2020 or 2030 or 2050, we will see some of these long-awaited dreams become reality. Figures 5 through 10 are some of my picks that connect reality to SciFi and to what we can anticipate in the years to come.

The year is 2035. The Three Laws of Robotics are a set of three rules written by science fiction author Isaac Asimov, which most robots appearing in his fiction must obey. Introduced in his 1942 short story "Runaround", the Laws state the following: 1. A robot may not harm a human being, or, through inaction, allow a human being to come to harm. 2. A robot must obey the orders given to it by human beings except where such orders would conflict with the First Law. 3. A robot must protect its own existence, as long as such protection does not conflict with the First or Second Law. (Wikipedia)

Fig. 5 Movie I-Robot, the age of robot and machine intelligence


Star Wars is an influential science-fantasy saga and fictional universe created by writer/producer/director George Lucas during the 1970s. (Wikipedia)

Fig. 6 Movie Star Wars, the age of human cloning

5 Quotations

"As complexity increases, precise statements lose meaning and meaningful statements lose precision." (Lotfi A. Zadeh)

"As far as the laws of mathematics refer to reality, they are not certain, and as far as they are certain, they do not refer to reality." (A. Einstein)

"The certainty that a proposition is true decreases with any increase of its precision. The power of a vague assertion rests in its being vague ("I am not tall");

Æon Flux was created by Korean American animator Peter Chung. It seems that 400 years prior, the rest of the planet's population succumbed to a vague industrial disease (shades of Todd Haynes' Safe) that was later cured by a distant ancestor of the current Chairman. Despite seemingly unlimited biotechnology, fear of disease prevents citizens from venturing out into the wild greenery beyond Bregna's walls. The city's population is having strange dreams and people are disappearing. (Wikipedia)

Fig. 7 Movie AeonFlux, the new age of communication and genetic cloning


"Minority Report" is a science fiction short story by Philip K. Dick, first published in 1956. A movie, Minority Report (2002), starring Tom Cruise, was directed by Steven Spielberg. The first and most obvious theme is whether individuals are dominated by fate or whether they have free will.

Fig. 8 Movie Minority Report, the age of human thoughts

A very precise assertion is almost never certain ("I am 1.71 cm tall"). Principle of incompatibility." (Pierre Duhem)

"Was Einstein misguided? Must we accept that there is a fuzzy, probabilistic quantum arena lying just beneath the definitive experiences of everyday reality? As of today, we still don't have a final answer. Fifty years after Einstein's death,

Mind-Manipulation Invention (Wikipedia)

Fig. 9 Movie Batman Forever, the age of reading the thought through human brain activities


Fantastic Voyage is a 1966 science fiction film written by Harry Kleiner. 20th Century Fox wanted a book that would be a tie-in with the movie, and hired Isaac Asimov to write a novelization based on the screenplay.

Fig. 10 Movie Fantastic Voyage, the age of nano-medical devices

however, the scales have certainly tipped farther in this direction." (Brian Greene, NYT, 2005) [28]

"One is tempted to rewrite Quantum Mechanics using Fuzzy Theory instead of Probability Theory. After all, Quantum Mechanics, upon which our description of matter is built, uses probabilities mainly for historical reasons: Probability Theory was the only theory of uncertainty available at the time. Today, we have a standard interpretation of the world which is based on population thinking: we cannot talk about a single particle, but only about sets of particles. We cannot know whether a particle will end up here or there, but only how many particles will end up here or there. The interpretation of quantum phenomena would be slightly different if Quantum Mechanics were based on Fuzzy Logic: probabilities deal with populations, whereas Fuzzy Logic deals with individuals; probabilities entail uncertainty, whereas Fuzzy Logic entails ambiguity. In a fuzzy universe a particle's position would be known at all times, except that such a position would be ambiguous (a particle would be simultaneously "here" to some degree and "there" to some other degree). This might be viewed as more plausible, or at least more in line with our daily experience that in nature things are less clearly defined than they appear in a mathematical representation of them." (Piero Scaruffi, 2003)

"What I believe, and what is as yet widely unrecognized, is that the genesis of computing with words and the computational theory of perceptions in my 1999 paper [26], "From Computing with Numbers to Computing with Words—from Manipulation of Measurements to Manipulation of Perceptions," will be viewed, in retrospect, as an important development in the evolution of fuzzy logic, marking the beginning of the third phase, 1999–. Basically, the development of computing with


words and perceptions brings together earlier strands of fuzzy logic and suggests that scientific theories should be based on fuzzy logic rather than on Aristotelian, bivalent logic, as they are at present. A key component of computing with words is the concept of Precisiated Natural Language (PNL). PNL opens the door to a major enlargement of the role of natural languages in scientific theories. It may well turn out to be the case that, in coming years, one of the most important application areas of fuzzy logic, and especially PNL, will be the Internet, centering on the conception and design of search engines and question-answering systems. From its inception, fuzzy logic has been—and to some degree still is—an object of skepticism and controversy. In part, skepticism about fuzzy logic is a reflection of the fact that, in English, the word "fuzzy" is usually used in a pejorative sense. But, more importantly, for some fuzzy logic is hard to accept because by abandoning bivalence it breaks with the centuries-old tradition of basing scientific theories on bivalent logic. It may take some time for this to happen, but eventually the abandonment of bivalence will be viewed as a logical development in the evolution of science and human thought." (Lotfi A. Zadeh, 2005)

6 Conclusions

Many of Zadeh's papers written in the eighties and early nineties were concerned, for the most part, with applications of fuzzy logic to knowledge representation and commonsense reasoning. Then, in 1999, a major idea occurred to him—an idea which underlies most of his current research activities. This idea was described in a seminal paper entitled "From Computing with Numbers to Computing with Words—From Manipulation of Measurements to Manipulation of Perceptions." His 1999 paper initiated a new direction in computation which he called Computing with Words (CW).

In Zadeh's view, Computing with Words (CW) opens a new chapter in the evolution of fuzzy logic and its applications. It has led him to the initiation of a number of novel theories, including Protoform Theory (PFT), the Theory of Hierarchical Definability (THD), Perception-based Probability Theory (PTp), Perception-based Decision Analysis (PDA) and the Unified Theory of Uncertainty (UTU) (Figs. 11–12). Zadeh believes that these theories will collectively have a wide-ranging impact on scientific theories, especially those in which human perceptions play a critical role—as they do in economics, decision-making, knowledge management, information analysis and other fields.

Successful applications of fuzzy logic and its rapid growth suggest that the impact of fuzzy logic will be felt increasingly in coming years. Fuzzy logic is likely to play an especially important role in science and engineering, but eventually its influence may extend much farther. In many ways, fuzzy logic represents a significant paradigm shift in the aims of computing - a shift which reflects the fact that the human mind, unlike present-day computers, possesses a remarkable ability to store and process information which is pervasively imprecise, uncertain and lacking in categoricity.


EVOLUTION OF FUZZY LOGIC – SHIFTS IN DIRECTION

crisp logic + PT → (1965) f-generalization → (1973) f.g-generalization → (1999) computing with words + PTp

PT = standard probability theory (measurement-based)
PTp = p-generalization of PT = perception-based probability theory
PT+ = f-generalization of PT
PT++ = f.g-generalization of PT

Fig. 11 Evolution of Fuzzy Logic (Zadeh’s Logic)

FROM COMPUTING WITH NUMBERS TO COMPUTING WITH WORDS; FROM COMPUTING WITH WORDS TO THE COMPUTATIONAL THEORY OF PERCEPTIONS AND THE THEORY OF HIERARCHICAL DEFINABILITY

CN: computing with numbers
CI: computing with intervals
GrC: computing with granules (granulation, generalized constraint)
PNL: precisiated natural language
CW: computing with words
CTP: computational theory of perceptions
THD: theory of hierarchical definability
CSNL: constraint-centered semantics of natural languages
GCL: generalized constraint language

Fig. 12 Fuzzy Logic, New Tools

Acknowledgments Funding for this research was provided in part by British Telecommunications (BT), OMRON, Tekes, Chevron-Texaco, the Imaging and Informatics-Life Sciences program of Lawrence Berkeley National Laboratory, and the BISC Program of UC Berkeley. The author would like to thank Prof. Lotfi A. Zadeh of UC Berkeley and Dr. James S. Albus, Senior NIST Fellow, for allowing the use of some of their presentation materials.


References

1. Zadeh, Lotfi A. (1965). Fuzzy sets. Information and Control 8, 338–353.
2. Zadeh, Lotfi A. (2005). Toward a Generalized Theory of Uncertainty (GTU)—An Outline. Information Sciences.
3. Scaruffi, Piero (2003). Thinking About Thought. iUniverse.
4. Savage, Leonard (1954). The Foundations of Statistics. John Wiley.
5. Shafer, Glenn (1976). A Mathematical Theory of Evidence. Princeton University Press.
6. Black, Max (1937). Vagueness, an exercise in logical analysis. Philosophy of Science 4, 427–455.
7. Russell, Bertrand (1923). Vagueness. Australasian Journal of Psychology and Philosophy 1, 84–92.
8. Brock, Jarrett E. (1979). Principal themes in Peirce's logic of vagueness. In Peirce Studies 1, 41–50. Lubbock: Institute for Studies in Pragmaticism.
9. Nadin, Mihai (1982). Consistency, completeness and the meaning of sign theories. American Journal of Semiotics 1(3), 79–98.
10. Nadin, Mihai (1983). The logic of vagueness and the category of synechism. In The Relevance of Charles Peirce, E. Freeman (ed.), 154–166. LaSalle, IL: Monist Library of Philosophy.
11. Engel-Tiercelin, Claudine (1992). Vagueness and the unity of C. S. Peirce's realism. Transactions of the C. S. Peirce Society 28(1), 51–82.
12. Merrell, Floyd (1995). Semiosis in the Postmodern Age. West Lafayette: Purdue University Press.
13. Merrell, Floyd (1996). Signs Grow: Semiosis and Life Processes. Toronto: University of Toronto Press.
14. Merrell, Floyd (1997). Peirce, Signs, and Meaning. Toronto: University of Toronto Press.
15. Merrell, Floyd (1998). Sensing Semiosis: Toward the Possibility of Complementary Cultural 'Logics'. New York: St. Martin's Press.
16. Zadeh, Lotfi A. (1950). An extension of Wiener's theory of prediction (with J. R. Ragazzini). J. Appl. Phys. 21, 645–655.
17. Zadeh, Lotfi A. (1952). The analysis of sampled-data systems (with J. R. Ragazzini). Applications and Industry (AIEE) 1, 224–234.
18. Zadeh, Lotfi A. (1963). Linear System Theory: The State Space Approach (with C. A. Desoer). New York: McGraw-Hill.
19. Lukaszewicz, Witold (1990). Non-Monotonic Reasoning. Ellis Horwood.
20. Zadeh, Lotfi A. (1973). Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. on Systems, Man and Cybernetics SMC-3, 28–44.
21. Zadeh, Lotfi A. (1975). Fuzzy logic and approximate reasoning. Synthese 30, 407–428.
22. Zadeh, Lotfi A. (1970). Decision-making in a fuzzy environment (with R. E. Bellman). Management Science 17, B-141–B-164.
23. Zadeh, Lotfi A. (1972). Fuzzy languages and their relation to human and machine intelligence. Proc. of Intl. Conf. on Man and Computer, Bordeaux, France, 130–165.
24. Zadeh, Lotfi A. (1979). A theory of approximate reasoning. In Machine Intelligence 9, J. Hayes, D. Michie, and L. I. Mikulich (eds.), 149–194. New York: Halstead Press.
25. Zadeh, Lotfi A. (1992). Fuzzy Logic for the Management of Uncertainty, L. A. Zadeh and J. Kacprzyk (eds.). New York: John Wiley & Sons.
26. Zadeh, Lotfi A. (1999). From computing with numbers to computing with words – from manipulation of measurements to manipulation of perceptions. IEEE Transactions on Circuits and Systems 45, 105–119.
27. Peirce, Charles S. (1908). CP 6.475, CP 6.458, CP 6.461 [CP refers to Collected Papers of Charles Sanders Peirce, C. Hartshorne, P. Weiss and A. W. Burks (eds.), Harvard University Press, Cambridge, Mass., 1931–58; the first number refers to the volume, the number after the dot to the paragraph, and the last number to the year of the text].
28. Greene, Brian (2005). One Hundred Years of Uncertainty. New York Times, Op-Ed, Late Edition – Final, Section A, Page 27, Column 1.

Page 65: Forging New Frontiers: Fuzzy Pioneers I

Pioneers of Vagueness, Haziness, and Fuzziness in the 20th Century1

Rudolf Seising

Abstract This contribution deals with developments in the history of philosophy, logic, and mathematics during the time before and up to the beginning of fuzzy logic. Even though the term "fuzzy" was introduced by Lotfi A. Zadeh in 1964/65, it should be noted that older concepts of "vagueness" and "haziness" had previously been discussed in philosophy, logic, mathematics, applied sciences, and medicine. This paper delineates some specific paths through the history of the use of these "loose concepts". Vagueness was avidly discussed in the fields of logic and philosophy during the first decades of the 20th century – particularly in Vienna, at Cambridge, and in Warsaw and Lvov. An interesting sequel to these developments can be seen in the work of the Polish physician and medical philosopher Ludwik Fleck.

Haziness and fuzziness were concepts of interest in mathematics and engineering during the second half of the 1900s. The logico-philosophical history presented here covers the work of Bertrand Russell, Max Black, and others. The mathematical-technical history deals with the theories founded by Karl Menger and Lotfi Zadeh. Menger's concepts of probabilistic metrics, hazy sets (ensembles flous), and micro geometry, as well as Zadeh's theory of fuzzy sets, paved the way for the establishment of soft computing methods using vague concepts that connote the nonexistence of sharp boundaries. The intention of this chapter is to show that these developments (historical and philosophical) were preparatory work for the introduction of soft computing as an artificial intelligence approach in 20th century technology.

Rudolf Seising
Medical University of Vienna, Core Unit for Medical Statistics and Informatics, Vienna, Austria
e-mail: [email protected]

1 This chapter is a modified and revised version of my conference contributions to the IFSA 2005 World Congress [1] and the BISC Special Event '05 Forging New Frontiers. 40th of Fuzzy Pioneers (1965–2005) in Honour of Prof. Lotfi A. Zadeh [2]. It accompanies a series of articles on the history of the theory of fuzzy sets that will appear in the near future, each in a different journal. An expanded survey of the author's historical reconstruction will appear in [3]. Four additional articles [4, 5, 6, 7] will highlight different aspects of this history. The idea underlying this project came from my friend Thomas Runkler, who suggested it during the FUZZ-IEEE 2004 conference in Budapest. Subsequently, the journals' chief editors encouraged me to realize the project.

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


1 Introduction

Exact concepts, i.e. concepts with strict boundaries, bivalent logic that enables us to decide yes or no, mathematical formulations to represent sharp values of quantities, measurements, and other terms are tools of logic and mathematics that have given modern science its exactness since Galileo (1564–1642) and Descartes (1596–1650) in the 17th century. It has been possible to formulate axioms, definitions, theorems, and proofs in the language of mathematics. Moreover, the ascendancy of modern science achieved through the works of Newton (1643–1727), Leibniz (1646–1716), Laplace (1749–1827), and many others made it seem that science was able to represent all the facts and processes that people observe in the world. But scientists are human beings, who also use natural languages that clearly lack the precision of logic and mathematics. Human language – and possibly also human understanding – are not strictly defined systems. On the contrary, they include a great deal of uncertainty, ambiguities, etc. They are, in fact, vague.

"Vagueness" is also part of the vocabulary of modern science. In his Essay Concerning Human Understanding (1689), John Locke (1632–1704) complained about the "vague and insignificant forms of speech", and in the French translation (1700) the French word "vague" is used for the English word "loose". Nevertheless, "vague" did not become a technical term in philosophy and logic during the 18th and 19th centuries. In the 20th century, however, philosophers like Gottlob Frege (1848–1925) (Fig. 1, left side), Bertrand Russell (1872–1970) (Fig. 3, left side), Max Black (1909–1988) (Fig. 3, middle), and others focused attention on and analyzed the problem of "vagueness" in modern science.

A separate and isolated development took place at the Lvov-Warsaw School of logicians in the work of Kazimierz Twardowski (1866–1938) (Fig. 1, middle), Tadeusz Kotarbinski (1886–1981), Kazimierz Ajdukiewicz (1890–1963) (Fig. 1, right side), Jan Łukasiewicz (1878–1956), Stanisław Lesniewski (1886–1939), Alfred Tarski (1901–1983), and many other Polish mathematicians and logicians. Their important contributions to modern logic were recognized when Tarski gave a lecture to the Vienna Circle in September 1929 – following an invitation extended after the Viennese mathematician Karl Menger (1902–1985) had got to know the Lvov-Warsaw scholars during his travels to Warsaw the previous summer. It turned out that these thinkers had been influenced by Frege's studies. This was especially true of Kotarbinski, who argued that a concept for a property is vague (Polish: chwiejne) if the property may be the case by grades [8], and Ajdukiewicz, who stated the definition that "a term is vague if and only if its use in a decidable context . . . will make the context undecidable in virtue of those [language] rules" [9]. The Polish characterization of "vagueness" was therefore the existence of fluid boundaries [8, 9].

In the first third of the 20th century there were several groups of European scientists and philosophers who concerned themselves with the interrelationships between logic, science, and the real world, e.g. in Berlin, Cambridge, Warsaw, and the so-called Vienna Circle. The scholars in Vienna regularly debated these issues over a period of years until the annexation of Austria by Nazi Germany in 1938 marked the end of the group. One member of the Vienna Circle was Karl Menger, who later became a professor of mathematics in the USA. As a young man in Vienna, Menger raised a number of important questions that culminated in the so-called principle of logical tolerance. In addition, in his work after 1940 on the probabilistic or statistical generalization of metric space, he introduced the new concepts of "hazy sets" (ensembles flous), t-norms, and t-conorms, which are also used today in the mathematical treatment of problems of vagueness in the theory of fuzzy sets.

Fig. 1 Gottlob Frege, Tadeusz Kotarbinski, Kazimierz Ajdukiewicz

This new mathematical theory to deal with vagueness was established in the mid-1960s by Lotfi A. Zadeh (born 1921) (Fig. 6, right side), who was then a professor of electrical engineering at Berkeley. In 1962 he described the basic necessity of a new scientific tool to handle very large and complex systems in the real world: "we need a radically different kind of mathematics, the mathematics of fuzzy or cloudy quantities which are not describable in terms of probability distributions. Indeed, the need for such mathematics is becoming increasingly apparent even in the realm of inanimate systems, for in most practical cases the a priori data as well as the criteria by which the performance of a man-made system are judged are far from being precisely specified or having accurately-known probability distributions" ([10], p. 857). In the two years following the publication of this paper, Zadeh developed the theory of fuzzy sets [11, 12, 13], and it has been possible to reconstruct the history of this process [1, 2, 3, 14, 15, 16, 17, 18, 19].

Very little is known about the connection between the philosophical work on vagueness and the mathematical theories of hazy sets and fuzzy sets. In this contribution the author shows that there is common ground in the scientific developments that have taken place in these different disciplines, namely, the attempt to find a way to develop scientific methods that correspond to human perception and knowledge, which is not the case with the exactness of modern science.

2 Vagueness – Loose Concepts and Borderless Phenomena

2.1 Logical Analysis of Vagueness

In his revolutionary book Begriffsschrift (1879) the German philosopher and mathematician Gottlob Frege confronted the problem of vagueness when formalizing the mathematical principle of complete induction: he saw that some predicates are not inductive, viz. they have been defined for all natural numbers, but they result in false conclusions; e.g. the predicate "heap" cannot be evaluated for all natural numbers [20]. Ten years later, when Frege revised the basics of his Begriffsschrift for a lecture to the Society of Medicine and Science in Jena at the beginning of 1891, he reinterpreted concepts as functions and consequently he introduced these functions of concepts everywhere. He stated: If "x+1" is meaningless for all arguments x, then the function x+1=10 has no value and no truth value either. Thus, the concept "that which when increased by 1 yields 10" would have no sharp boundaries. Accordingly, for functions the demand of sharp boundaries entails that they must have a value for every argument [21]. This is a mathematical verbalization of what is called the classical sorites paradox, which can be traced back to the old Greek word σωρός (sorós, "heap") used by Eubulides of Miletus (4th century BC).

In his Grundgesetze der Arithmetik (Basic Laws of Arithmetic), which appeared in the years 1893–1903, Frege called for concepts with sharp boundaries, because otherwise we could break logical rules and, moreover, the conclusions we draw could be false [22].

Frege's specification of vagueness as a particular phenomenon influenced other scholars, notably his British contemporary and counterpart, the philosopher and mathematician Bertrand Russell, who published the first logico-philosophical article on "Vagueness" in 1923 [23]. Russell quoted the sorites – in fact, he did not use mathematical language in this article, but, for example, discussed colours and "bald men" (Greek: falakros; English: fallacy, false conclusion): "Let us consider the various ways in which common words are vague, and let us begin with such a word as 'red'. It is perfectly obvious, since colours form a continuum, that there are shades of colour concerning which we shall be in doubt whether to call them red or not, not because we are ignorant of the meaning of the word 'red', but because it is a word the extent of whose application is essentially doubtful. This, of course, is the answer to the old puzzle about the man who went bald. It is supposed that at first he was not bald, that he lost his hairs one by one, and that in the end he was bald; therefore, it is argued, there must have been one hair the loss of which converted him into a bald man. This, of course, is absurd. Baldness is a vague conception; some men are certainly bald, some are certainly not bald, while between them there are men of whom it is not true to say they must either be bald or not bald." ([23], p. 85). Russell showed in this article from 1923 that concepts are vague even though there have been and continue to be many attempts to define them precisely: "The metre, for example, is defined as the distance between two marks on a certain rod in Paris, when that rod is at a certain temperature. Now, the marks are not points, but patches of a finite size, so that the distance between them is not a precise conception. Moreover, temperature cannot be measured with more than a certain degree of accuracy, and the temperature of a rod is never quite uniform. For all these reasons the conception of a metre is lacking in precision." ([23], p. 86)

Russell also argued that a proper name – and here we can take as an example the name "Lotfi Zadeh" – cannot be considered to be an unambiguous symbol even if we believe that there is only one person with this name. Lotfi Zadeh "was born, and being born is a gradual process. It would seem natural to suppose that the name was not attributable before birth; if so, there was doubt, while birth was taking place, whether the name was attributable or not. If it be said that the name was attributable before birth, the ambiguity is even more obvious, since no one can decide how long before birth the name became attributable." ([23], p. 86) Russell reasoned "that all words are attributable without doubt over a certain area, but become questionable within a penumbra, outside which they are again certainly not attributable." ([23], p. 86f) Then he generalized that words of pure logic also have no precise meanings; e.g. in classical logic the composed proposition "p or q" is false only when p and q are false and true elsewhere.

He went on to claim that the truth values "'true' and 'false' can only have a precise meaning when the symbols employed – words, perceptions, images . . . – are themselves precise". As we have seen above, this is not possible in practice, so he concludes "that every proposition that can be framed in practice has a certain degree of vagueness; that is to say, there is not one definite fact necessary and sufficient for its truth, but a certain region of possible facts, any one of which would make it true. And this region is itself ill-defined: we cannot assign to it a definite boundary." Russell emphasized that there is a difference between what we can imagine in theory and what we can observe with our senses in reality: "All traditional logic habitually assumes that precise symbols are being employed. It is therefore not applicable to this terrestrial life, but only to an imagined celestial existence." ([23], p. 88f). He proposed the following definition of accurate representations: "One system of terms related in various ways is an accurate representation of another system of terms related in various other ways if there is a one-one relation of the terms of the one to the terms of the other, and likewise a one-one relation of the relations of the one to the relations of the other, such that, when two or more terms in the one system have a relation belonging to that system, the corresponding terms of the other system have the corresponding relation belonging to the other system." And in contrast to this, he stated that "a representation is vague when the relation of the representing system to the represented system is not one-one, but one-many." ([23], p. 89) He concluded that "Vagueness, clearly, is a matter of degree, depending upon the extent of the possible differences between different systems represented by the same representation. Accuracy, on the contrary, is an ideal limit." ([23], p. 90).

The Cambridge philosopher and mathematician Max Black responded to Russell's article in "Vagueness. An exercise in logical analysis", published in 1937 [24]. Black was born in Baku, the capital of the former Soviet Republic of Azerbaijan.2 Because of the anti-Semitism in their homeland, his family emigrated to Europe, first to Paris and then to London, where Max Black grew up after 1912. He studied at the University of Cambridge and took his BA in 1930. He went to Göttingen in Germany for the academic year 1930/31 and afterwards continued his studies in London, where he received a PhD in 1939. His doctoral dissertation was entitled "Theories of logical positivism" [..]. After 1940 he taught at the Department of Philosophy at the University of Illinois at Urbana, and in 1946 he became a professor at Cornell University in Ithaca, New York.

2 It is a curious coincidence that Max Black and Lotfi Zadeh, the founder of the theory of fuzzy sets, were both born in the same city and both became citizens of the USA in the 1940s.


Influenced by Russell and Wittgenstein (and the other famous analytical philosophers at Cambridge, Frank P. Ramsey (1903–1930) and George E. Moore (1873–1958)), in 1937 Black continued Russell's approach to the concept of vagueness, and he differentiated vagueness from ambiguity, generality, and indeterminacy. He emphasized "that the most highly developed and useful scientific theories are ostensibly expressed in terms of objects never encountered in experience. The line traced by a draughtsman, no matter how accurate, is seen beneath the microscope as a kind of corrugated trench, far removed from the ideal line of pure geometry. And the 'point-planet' of astronomy, the 'perfect gas' of thermodynamics, and the 'pure species' of genetics are equally remote from exact realization." ([24], p. 427)

Black proposed a new method to symbolize vagueness: "a quantitative differentiation, admitting of degrees, and correlated with the indeterminacy in the divisions made by a group of observers." ([24], p. 441) He assumed that the vagueness of a word involves variations in its application by different users of a language and that these variations fulfill systematic and statistical rules when one symbol has to be discriminated from another. He defined this discrimination of a symbol x with respect to a language L by DxL (= Dx¬L) (Fig. 2).

Most speakers of a language and the same observer in most situations will determine that either L or ¬L is used. In both cases, among competent observers there is a certain unanimity, a preponderance of correct decisions. For all DxL with the same x but not necessarily the same observer, m is the number of L uses and n the number of ¬L uses. On this basis, Black stated the following definition: "We define the consistency of application of L to x as the limit to which the ratio m/n tends when the number of DxL and the number of observers increase indefinitely. [. . .] Since the consistency of the application, C, is clearly a function of both L and x, it can be written in the form C(L, x)." ([24], p. 442)

Fig. 2 Consistency of application of a typical vague symbol ([24], p. 443)
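Black's consistency of application can be illustrated numerically. The sketch below is one possible reading of the definition, not Black's own procedure: the function name `consistency` and the list-of-booleans observer model are assumptions for demonstration; the function computes the finite-sample ratio m/n from a batch of observer decisions about one symbol L and one object x.

```python
# Illustrative sketch (assumed model, not Black's own procedure):
# the finite-sample consistency of application of symbol L to x,
# i.e. the ratio m/n of observers applying L to observers applying ¬L.

def consistency(decisions):
    """decisions: list of booleans, True = observer applied L to x,
    False = observer applied ¬L. Returns the ratio m/n."""
    m = sum(1 for d in decisions if d)  # count of L applications
    n = len(decisions) - m              # count of ¬L applications
    if n == 0:
        # every observer applied L: the ratio grows without bound
        return float("inf")
    return m / n

# Example: 80 of 100 observers call a borderline colour patch "red".
print(consistency([True] * 80 + [False] * 20))  # -> 4.0
```

Black's C(L, x) is the limit of this ratio as the number of decisions and observers grows indefinitely; the finite sample above only approximates it.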


Fig. 3 Bertrand Russell, Max Black, Ludwik Fleck

More than a quarter century later and two years before Zadeh introduced fuzzy sets into science, Black published "Reasoning with loose concepts". In this article he labelled concepts without precise boundaries as "loose concepts" rather than "vague" ones, in order to avoid misleading and pejorative implications [25]. Once again he expressly rejected Russell's assertion that traditional logic is "not applicable" as a method of conclusion for vague concepts: "Now, if all empirical concepts are loose, as I think they are, the policy becomes one of abstention from any reasoning from empirical premises. If this is a cure, it is one that kills the patient. If it is always wrong to reason with loose concepts, it will, of course, be wrong to derive any conclusion, paradoxical or not, from premises in which such concepts are used. A policy of prohibiting reasoning with loose concepts would destroy ordinary language – and, for that matter, any improvement upon ordinary language that we can imagine." ([25], p. 7)

2.2 Phenomena of Diseases Do Not Have Strict Boundaries

Reasoning with loose concepts is standard practice in medical thinking. In 1926 the young Polish physician and philosopher Ludwik Fleck (1896–1961) (Fig. 3, right side) expressed this fact in a striking manner: "It is in medicine that one encounters a unique case: the worse the physician, the 'more logical' his therapy." Fleck said this in a lecture to the Society of Lovers of the History of Medicine at Lvov entitled Some specific features of the medical way of thinking, which was published in 1927 (in Polish) [26]. Even though he was close to the Polish school of logic, he opposed the view that medical diagnoses are the result of strong logical reasoning. He thought that elements of medical knowledge, symptoms, and diseases are essentially indeterminate and that physicians rely on their intuition rather than on logical consequences to deduce a disease from a patient's symptoms.


Ludwik Fleck was born in Lvov (Poland), where he received his medical degree at the Jan Kazimierz University. When Nazi Germany attacked the Soviet Union in 1941 and German forces occupied Lvov, Fleck was deported to the city's Jewish ghetto. In 1943 he was sent to the Auschwitz concentration camp. From the end of 1943 to April 1945, he was detained in Buchenwald, where he worked in a laboratory set up by the SS for the production and study of production methods for typhus serum.

After the Second World War, Fleck served as the head of the Institute of Microbiology of the School of Medicine in Lublin and then as the Director of the Department of Microbiology and Immunology at a state institute in Warsaw. After 1956 he worked at the Israel Institute for Biological Research in Ness-Ziona. He died at the age of 64 of a heart attack.

Fleck was both a medical scientist and a philosopher. In 1935 he wrote Genesis and Development of a Scientific Fact (in German) [27], but unfortunately this very important work was unknown in most parts of the world until it was translated into English in the late 1970s [28]. In it Fleck anticipated many of Thomas Kuhn's ideas on the sociological and epistemological aspects of scientific development that Kuhn published in his very influential book The Structure of Scientific Revolutions [29].

In Fleck's philosophy of science, sciences grow like living organisms and the development of a scientific fact depends on particular "thought-styles". Fleck denied the existence of any absolute and objective criteria of knowledge; on the contrary, he was of the opinion that different views can be true. He suggested that truth in science is a function of a particular "thinking style" by the "thought-collective", that is, a group of scientists or people "exchanging ideas or maintaining intellectual interaction". Fleck adopted this "notion of the relativity of truth" from his research on the philosophy of medicine, where he pointed out that diseases are constructed by physicians and do not exist as objective entities.

Some specific features of the medical way of thinking begins with the following sentence: "Medical science, whose range is as vast as its history is old, has led to the formation of a specific style in the grasping of its problems and of a specific way of treating medical phenomena, i.e. to a specific type of thinking." A few lines later, he exemplified this assumption: "Even the very subject of medical cognition differs in principle from that of scientific cognition. A scientist looks for typical, normal phenomena, while a medical man studies precisely the atypical, abnormal, morbid phenomena. And it is evident that he finds on this road a great wealth and range of individuality of these phenomena which form a great number without distinctly delimited units, and abounding in transitional, boundary states. There exists no strict boundary between what is healthy and what is diseased, and one never finds exactly the same clinical picture again. But this extremely rich wealth of forever different variants is to be surmounted mentally, for such is the cognitive task of medicine. How does one find a law for irregular phenomena? – This is the fundamental problem of medical thinking. In what way should they be grasped and what relations should be adopted between them in order to obtain a rational understanding?" ([30], p. 39)


Fleck emphasized two points. The first of these was the impact of the knowledge explosion in medical science. Accelerated progress in medical research had led to an enormous number of highly visible disease phenomena. Fleck argued that medical research has "to find in this primordial chaos, some laws, relationships, some types of higher order". He appreciated the vital role played by statistics in medicine, but he raised the objections that numerous observations "eliminate the individual character of the morbid element" and "the statistical observation itself does not create the fundamental concept of our knowledge, which is the concept of the clinical unit." ([30], p. 39) Therefore "abnormal morbid phenomena are grouped round certain types, producing laws of higher order, because they are more beautiful and more general than the normal phenomena which suddenly become profoundly intelligible. These types, these ideal, fictitious pictures, known as morbid units, round which both the individual and the variable morbid phenomena are grouped, without, however, ever corresponding completely to them – are produced by the medical way of thinking, on the one hand by specific, far-reaching abstraction, by rejection of some observed data, and on the other hand, by the specific construction of hypotheses, i.e. by guessing of non-observed relations." ([30], p. 40)

The second point that concerned Fleck was the absence of sharp borders between these phenomena: "In practice one cannot do without such definitions as 'chill', 'rheumatic' or 'neuralgic' pain, which have nothing in common with this bookish rheumatism or neuralgia. There exist various morbid states and syndromes of subjective symptoms that up to now have failed to find a place and are likely not to find it at any time. This divergence between theory and practice is still more evident in therapy, and even more so in attempts to explain the action of drugs, where it leads to a peculiar pseudo-logic." ([30], p. 42)

Clearly, it is very difficult to define sharp borders between various symptoms in the set of all symptoms and between various diseases in the set of diseases, respectively. On the contrary, we can observe smooth transitions from one entity to another, and perhaps a very small variation might be the reason why a medical doctor diagnoses a patient with disease x instead of disease y.

Therefore Fleck stated that physicians use a specific style of thinking when they deliberate on attendant symptoms and the diseases patients suffer from. Of course, Fleck could not have known anything about the methods of fuzzy set theory, but he was a philosopher of vagueness in medical science. He contemplated a "space of phenomena of disease" and realized that there are no boundaries either in a continuum of phenomena of diseases or between what is diseased and what is healthy. Some 40 years later Lotfi Zadeh proposed to handle similar problems with his new theory of fuzzy sets when he lectured to the audience of the International Symposium on Biocybernetics of the Central Nervous System: "Specifically, from the point of view of fuzzy set theory, a human disease, e.g., diabetes may be regarded as a fuzzy set in the following sense. Let X = {x} denote the collection of human beings. Then diabetes is a fuzzy set, say D, in X, characterized by a membership function μD(x) which associates with each human being x his grade of membership in the fuzzy set of diabetes" ([31], p. 205). This "fuzzy view" on vagueness in medicine was written without any knowledge of Fleck's work, but it offers an interesting view of the history of socio-philosophical and system-theoretical concepts of medicine. For more details the reader is referred to [32].
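Zadeh's remark about the fuzzy set D of diabetes can be made concrete with a toy membership function. The sketch below is purely illustrative and is not Zadeh's formula: the choice of fasting glucose as the underlying variable, the thresholds, and the linear ramp are all hypothetical assumptions for demonstration.

```python
# Illustrative sketch (hypothetical model, not Zadeh's formula):
# a membership function mu_D assigning each person a grade in the
# fuzzy set D of "diabetes", modelled here from a single fasting
# glucose value in mg/dL with an invented linear ramp.

def mu_diabetes(glucose: float) -> float:
    """Grade of membership in the fuzzy set 'diabetes': 0 at or below
    100, 1 at or above 126, linear in between (assumed thresholds)."""
    if glucose <= 100.0:
        return 0.0
    if glucose >= 126.0:
        return 1.0
    return (glucose - 100.0) / 26.0

# Borderline cases receive intermediate grades rather than a yes/no.
for g in (90, 113, 140):
    print(g, round(mu_diabetes(g), 2))  # prints 90 0.0, 113 0.5, 140 1.0
```

The point of the sketch is Fleck's observation restated in Zadeh's terms: the transition from "healthy" to "diseased" is a gradual slope, not a strict boundary.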

3 Haziness – A Micro Geometrical Approach

Karl Menger (Fig. 5, right side), a mathematician in Vienna and later in the USA, was one of the first to begin laying the groundwork for human-friendly science, i.e. for the development of scientific methods to deal with loose concepts. In his work, he never abandoned the framework of classical mathematics, but used probabilistic concepts and methods.

Karl Menger was born in Vienna, Austria, and he entered the University of Vienna in 1920. In 1924, after he had received his PhD, he moved to Amsterdam because of his interests in topology. There, he became the assistant of the famous mathematician and topologist Luitzen E. J. Brouwer (1881–1966). In 1927 Menger returned to Vienna to become professor of geometry at the University, and he became a member of the Vienna Circle. In 1937 – one year before the annexation of Austria by Nazi Germany – he immigrated to the USA, where he was appointed to a professorship at the University of Notre Dame. In 1948 Menger moved to the Illinois Institute of Technology, and he remained in Chicago for the rest of his life.

3.1 Logical Tolerance

When Menger came back to Vienna from Amsterdam as a professor of geometry, he was invited to give a lecture on Brouwer's intuitionism to the Vienna Circle. In this talk he rejected the view held by almost all members of this group that there is one unique logic. He claimed that we are free to choose axioms and rules in mathematics, and thus we are free to consider different systems of logic. He realized the philosophical consequences of this assumption, which he shared with his student Kurt Gödel (1906–1978): "the plurality of logics and language entailing some kind of logical conventionalism" ([33], p. 88). Later in the 1930s, when the Vienna Circle became acquainted with different systems of logic – e.g. the three- and multi-valued logics founded by Jan Łukasiewicz (1878–1956), which were also discussed by Tarski in his Vienna lecture, and Brouwer's intuitionistic logic – Rudolf Carnap (1891–1970) also defended this tolerant view of logic. In his lecture The New Logic: A 1932 Lecture, Menger wrote: "What interests the mathematician and all that he does is to derive propositions by methods which can be chosen in various ways but must be listed. And to my mind all that mathematics and logic can say about this activity of the mathematician (which neither needs justification nor can be justified) lies in this simple statement of fact." ([34], p. 35) This "logical tolerance" proposed by Menger later became well known through Carnap's famous book Logische Syntax der Sprache (Logical Syntax of Language) in 1934 [35]. It is a basic principle for the contemplation of "deviant logics", that is, systems of logic that differ from the usual bivalent logic, one of which is the logic of fuzzy sets.


3.2 Statistical Metrics, Probabilistic Geometry, the Continuum and Poincaré's Paradox

In Vienna in the 1920s and 1930s, Menger evolved into a specialist in topology and geometry, particularly with regard to the theories of curves, dimensions, and general metrics. After he emigrated to the USA, he returned to these subjects. In 1942, with the intention of generalizing the theory of metric spaces more in the direction of probabilistic concepts, he introduced the term "statistical metric": A statistical metric is "a set S such that with each two elements ('points') p and q of S, a probability function Π(x; p, q) is associated satisfying the following conditions:

1. Π(0; p, p) = 1.
2. If p ≠ q, then Π(0; p, q) < 1.
3. Π(x; p, q) = Π(x; q, p).

4. T [Π(x; p, q), Π(y; q, r)] ≤ Π(x + y; p, r).

where T [α, β] is a function defined for 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1 such that

(a) 0 ≤ T(α, β) ≤ 1.
(b) T is non-decreasing in either variable.
(c) T(α, β) = T(β, α).
(d) T(1, 1) = 1.
(e) If α > 0, then T(α, 1) > 0.” ([36], p. 535f)

Menger called Π(x; p, q) the distance function of p and q, which bears the meaning of the probability that the points p and q have a distance ≤ x. Condition 4, the “triangular inequality” of the statistical metric S, implies the following inequality for all points q and all numbers x between 0 and z:

Π(z; p, r) ≥ Max T [Π(x; p, q),Π(z − x; q, r)].

In this paper Menger used the name triangular norm (t-norm) to indicate the function T for the first time.
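Menger’s conditions (a)–(e) can be checked mechanically. The following sketch is not from Menger’s paper; it is an illustration that verifies two standard examples of a t-norm, the product and the minimum, against the conditions on a grid of sample points:

```python
# Illustrative check (not from [36]) that two classical t-norms satisfy
# Menger's conditions (a)-(e) on a finite grid of values in [0, 1].

def t_product(a, b):
    return a * b

def t_min(a, b):
    return min(a, b)

def satisfies_menger_conditions(T, steps=10):
    grid = [i / steps for i in range(steps + 1)]
    for a in grid:
        for b in grid:
            v = T(a, b)
            if not (0.0 <= v <= 1.0):      # (a) values stay in [0, 1]
                return False
            if T(a, b) != T(b, a):         # (c) symmetry
                return False
            # (b) non-decreasing in the first variable
            # (symmetry then covers the second variable)
            for a2 in grid:
                if a2 >= a and T(a2, b) < v:
                    return False
    if T(1.0, 1.0) != 1.0:                 # (d)
        return False
    return all(T(a, 1.0) > 0 for a in grid if a > 0)   # (e)

print(satisfies_menger_conditions(t_product))  # True
print(satisfies_menger_conditions(t_min))      # True
```

The minimum reappears below as the intersection operation of fuzzy sets, which is why these axioms matter for the rest of the chapter.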

Almost 10 years later, in February 1951, Menger wrote two more notes to the National Academy of Sciences. In the first note, “Probabilistic Geometry” [37], among other things he introduced a new notation ab for the non-decreasing cumulative distribution function associated with every ordered pair (a, b) of elements of a set S, and he presented the following definition: “The value ab(x) may be interpreted as the probability that the distance from a to b be < x.” ([37], p. 226.) Much more interesting is the following text passage: “We call a and a′ certainly-indistinguishable if aa′(x) = 1 for each x > 0. Uniting all elements which are certainly indistinguishable from each other into identity sets, we decompose the space into disjoint sets A, B, . . . We may define AB(x) = ab(x) for any a belonging to A and b belonging to B. (The number is independent of the choice of a and b.) The


66 R. Seising

identity sets form a perfect analog of an ordinary metric space since they satisfy thecondition

If A ≠ B, then there exists a positive x with AB(x) < 1.”3

In the paper “Probabilistic Theories of Relations” of the same year [38], Menger addressed the difference between the mathematical and the physical continuum. Regarding A, B, and C as elements of a continuum, he referred to the French mathematician and physicist Henri Poincaré (1854–1912) (Fig. 5, middle), who claimed “that only in the mathematical continuum do the equalities A = B and B = C imply the equality A = C. In the observable physical continuum, ‘equal’ means ‘indistinguishable’, and A = B and B = C by no means imply A = C. The raw result of experience may be expressed by the relation A = B, B = C, A < C, which may be regarded as the formula for the physical continuum.” According to Poincaré, physical equality is a non-transitive relation. ([38], p. 178.) Menger suggested a realistic description of the equality of elements in the physical continuum by associating with each pair (A, B) of these elements the probability that A and B will be found to be indistinguishable. He argued: “For it is only very likely that A and B are equal, and very likely that B and C are equal – why should it not be less likely that A and C are equal? In fact, why should the equality of A and C not be less likely than the inequality of A and C?” ([38], p. 178.) To solve “Poincaré’s paradox” Menger used his concept of probabilistic relations and geometry: For the probability E(a, b) that a and b would be equal he postulated:

E(a, a) = 1 for every a;
E(a, b) = E(b, a) for every a and b;
E(a, b) · E(b, c) ≤ E(a, c) for every a, b, c.

If E(a, b) = 1, then he called a and b certainly equal. (In this case we obtain theordinary equality relation.) “All the elements which are certainly equal to a maybe united to an ‘equality set’, A. Any two such sets are disjoint unless they areidentical.” ([38], p. 179.)
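Menger’s postulates can be made concrete with a small numeric sketch: the product inequality allows A = B and B = C each to be very likely while A = C is somewhat less likely, which is exactly Poincaré’s paradox without any contradiction. All values below are invented for illustration:

```python
# Invented probabilities for Menger's equality relation E; only the
# postulates themselves are from [38].
E = {
    ("a", "a"): 1.0, ("b", "b"): 1.0, ("c", "c"): 1.0,
    ("a", "b"): 0.9, ("b", "c"): 0.9, ("a", "c"): 0.82,
}

def e(x, y):
    # symmetric lookup: E(a, b) = E(b, a)
    return E.get((x, y), E.get((y, x)))

# reflexivity: E(a, a) = 1
assert all(e(x, x) == 1.0 for x in "abc")
# symmetry: E(a, b) = E(b, a)
assert e("a", "b") == e("b", "a")
# product "triangle" inequality: E(a, b) * E(b, c) <= E(a, c)
assert e("a", "b") * e("b", "c") <= e("a", "c")

# a=b and b=c are very likely (0.9), yet a=c is less likely (0.82):
# Poincaré's observation, consistent with Menger's postulates.
print(e("a", "c"))  # 0.82
```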

In 1951, as a visiting lecturer at the Sorbonne, Menger presented similar ideas in the May session of the French Académie des sciences in his “Ensembles flous et fonctions aléatoires”. He proposed to replace the ordinary element relation “∈” between each object x in the universe of discourse U and a set F by the probability ΠF(x) of x belonging to F. In contrast to ordinary sets, he called these entities “ensembles flous”: “Une relation monaire au sens classique est un sous-ensemble F de l’univers. Au sens probabiliste, c’est une fonction ΠF définie pour tout x ∈ U. Nous appellerons cette fonction même un ensemble flou et nous interpréterons ΠF(x) comme la probabilité que x appartienne à cet ensemble.” [“A monadic relation in the classical sense is a subset F of the universe. In the probabilistic sense, it is a function ΠF defined for every x ∈ U. We shall call this function itself a fuzzy set (ensemble flou), and we shall interpret ΠF(x) as the probability that x belongs to this set.”] ([39], p. 2002) Later, he also used the English expression “hazy sets” [40].

3 In his original paper Menger wrote “>”. I would like to thank E. P. Klement for this correction.


3.3 Hazy Lumps and Micro Geometry

Almost 15 years later, in 1966, Menger read a paper on “Positivistic Geometry” at the Mach symposium that was held by the American Association for the Advancement of Science to mark the 50th anniversary of the death of the Viennese physicist and philosopher Ernst Mach (1838–1916) (Fig. 5, left side). The very important role of Mach’s philosophical thinking and his great influence on Vienna Circle philosophy are obvious from the initial name of this group of scholars in the first third of the 20th century – “Verein (Society) Ernst Mach”. In his contribution to the Mach symposium in the USA, Menger began with the following passage from Mach’s chapter on the continuum in his book Die Principien der Wärmelehre (Principles of the Theory of Heat), published in 1896: “All that appears to be a continuum might very well consist of discrete elements, provided they were sufficiently small compared with our smallest practically applicable units or sufficiently numerous.” [41] Then he again described Poincaré’s publications connected with the notion of the physical continuum and he summarized his own work on statistical metrics, probabilistic distances, the indistinguishability of elements, and ensembles flous. He believed that it would be important in geometry to combine these concepts with the concept “of lumps, which can be more easily identified and distinguished than points. Moreover, lumps admit an intermediate stage between indistinguishability and apartness, namely that of overlapping. It is, of course, irrelevant whether the primitive (i.e. undefined) concepts of a theory are referred to as points and probably indistinguishable, or as lumps and probably overlapping. All that matters are the assumptions made about the primitive concepts. But the assumptions made in the two cases would not be altogether identical and I believe that the ultimate solution of problems of micro geometry may well lie in a probabilistic theory of hazy lumps. The essential feature of this theory would be that lumps would not be point sets; nor would they reflect circumscribed figures such as ellipsoids. They would rather be in mutual probabilistic relations of overlapping and apartness, from which a metric would have to be developed.” [40]

Menger reminded his readers of “another example of distances defined by chains” that is a more detailed representation of what Russell had mentioned more than 40 years before: In the 1920s, the well-known Viennese physicist Erwin Schrödinger (1887–1961) – who was later one of the founders of quantum mechanics – established a theory of a metric of colors, called colorimetry (Farbenmetrik) (Fig. 4): “Schrödinger sets the distance between two colors C and C′ equal to 1 if C and C′ are barely distinguishable; and equal to the integer n if there exists a sequence of elements, C0 = C, C1, C2, . . . , Cn−1 = C′, such that any two consecutive elements Ci−1 and Ci are barely distinguishable while there does not exist any shorter chain of this kind. According to Schrödinger’s assumption about the color cone, each element C (represented by a vector in a 3-dimensional space) is the center of an ellipsoid whose interior points cannot be distinguished from C while the points on the surface of the ellipsoid are barely distinguishable from C. But such well-defined neighborhoods do not seem to exist in the color cone any more than on the human skin. One must first define distance 1 in a probabilistic way.” ([42], p. 232).
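In modern terms, the chain distance Menger recounts is a shortest-path length in a graph whose edges join barely distinguishable colors. A small sketch using breadth-first search, with an invented graph of color names standing in for Schrödinger’s color cone:

```python
# Illustrative sketch of Schrödinger's chain distance: distance 1 between
# barely distinguishable colors, distance n = length of the shortest chain
# of barely distinguishable steps. The graph below is invented.
from collections import deque

def chain_distance(barely, c, c_prime):
    """Length of the shortest chain from c to c_prime, where `barely`
    maps each color to its barely-distinguishable neighbours."""
    queue, seen = deque([(c, 0)]), {c}
    while queue:
        node, dist = queue.popleft()
        if node == c_prime:
            return dist
        for nxt in barely[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # no chain exists

barely = {"C": ["C1"], "C1": ["C", "C2"], "C2": ["C1", "C'"], "C'": ["C2"]}
print(chain_distance(barely, "C", "C'"))  # 3
```

Menger’s point is precisely that the edges of such a graph (“barely distinguishable”) are not sharply defined in practice, so distance 1 itself would have to be defined probabilistically.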


Fig. 4 Schrödinger’s color cone ([42], p. 428)

Menger never envisaged a mathematical theory of loose concepts that differs from probability theory. He compared his “micro geometry” with the theory of fuzzy sets: “In a slightly different terminology, this idea was recently expressed by Bellman, Kalaba and Zadeh under the name fuzzy set. (These authors speak of the degree rather than the probability of an element belonging to a set.)” [40] Menger did not see that this “slight difference” between “degrees” (fuzziness) and “probabilities” is a difference not just in terminology but in the meaning of the concepts.

Ludwik Fleck and Karl Menger were philosophers of vagueness in the first half of the 20th century, each one with a specific field of study. Fleck’s field of research was physicians’ specific style of thinking. He contemplated a “space of phenomena of disease” and realized that there are no clear boundaries either between these phenomena or between what is diseased and what is healthy. Fleck therefore called the medical way of thinking a “specific style” that is not in accordance with deductive logic. Today we would say that this way of thinking is essentially fuzzy.

The mathematician Menger looked for a tool to deal with elements of the physical continuum, which is different from the mathematical continuum, because these elements can be indistinguishable but not identical. To that end, Menger generalized the theory of metric spaces, but he could not break away from probability theory and statistics. Thus, Menger created a theory of probabilistic distances of points or elements of the physical continuum. Both of these scholars concerned themselves with tools to deal with fuzziness before the theory of fuzzy sets came into being.

Fig. 5 Ernst Mach, Jules Henri Poincaré, Karl Menger


4 Fuzzy Sets – More Realistic Mathematics

4.1 Thinking Machines

In the late 1940s and early 1950s, information theory, the mathematical theory of communication, and cybernetics, developed during the Second World War by Claude E. Shannon (1916–2001) (Fig. 6, second from left), Norbert Wiener (1894–1964) (Fig. 6, left side), Andrej N. Kolmogorov (1903–1987), Ronald A. Fisher (1890–1962), and many others, became well known. When Shannon and Wiener went to New York to give lectures on their new theories at Columbia University in 1946, they introduced these new milestones in science and technology to the young doctoral student of electrical engineering Lotfi A. Zadeh.

The new era of digital computers – which began in the 1940s with the Electronic Numerical Integrator and Computer (ENIAC) and the Electronic Discrete Variable Computer (EDVAC), both of which were designed by J. P. Eckert (1919–1995) and J. W. Mauchly (1907–1980), and continued with developments by scientists such as John von Neumann and others – also gave huge stimuli to the field of engineering. In 1950 Zadeh moderated a discussion between Shannon, Edmund C. Berkeley (1923–1988, author of the book Giant Brains or Machines That Think [43], published in 1949), and the mathematician and IBM consultant Francis J. Murray (1911–1996). “Can machines think?” was Alan Turing’s (Fig. 6, second from right) question in his famous Mind article “Computing Machinery and Intelligence” of the same year [44]. Turing proposed the imitation game, now called the “Turing Test”, to decide whether a computer or a program could think like a human being or not. Inspired by Wiener’s Cybernetics [45], Shannon’s Mathematical Theory of Communication [46], and the new digital computers, Zadeh wrote the article “Thinking Machines – A New Field in Electrical Engineering” (Fig. 7) in the student journal The Columbia Engineering Quarterly in 1950 [47].

As a prelude, he quoted some of the headlines that had appeared in newspapers throughout the USA during 1949: “Psychologists Report Memory is Electrical”, “Electric Brain Able to Translate Foreign Languages is Being Built”, “Electronic Brain Does Research”, “Scientists Confer on Electronic Brain”, and he asked, “What is behind these headlines? How will ‘electronic brains’ or ‘thinking machines’

Fig. 6 Norbert Wiener, Claude E. Shannon, Alan M. Turing, Lotfi A. Zadeh


Fig. 7 Headline of L. A. Zadeh’s 1950 paper [47]

affect our way of living? What is the role played by electrical engineers in the design of these devices?” ([47], p. 12.)

Zadeh was interested in “the principles and organization of machines which behave like a human brain. Such machines are now variously referred to as ‘thinking machines’, ‘electronic brains’, ‘thinking robots’, and similar names.” In a footnote he added that the “same names are frequently ascribed to devices which are not ‘thinking machines’ in the sense used in this article” and specified that “The distinguishing characteristic of thinking machines is the ability to make logical decisions and to follow these, if necessary, by executive action.” ([47], p. 12) In the article, moreover, Zadeh gave the following definition: “More generally, it can be said that a thinking machine is a device which arrives at a certain decision or answer through the process of evaluation and selection.”

On the basis of this definition, he decided that the MIT differential analyzer wasnot a thinking machine, but that both of the large-scale digital computers that hadbeen built at that time, UNIVAC and BINAC, were thinking machines because theyboth were able to make non-trivial decisions. ([47], p. 13) Zadeh explained “how athinking machine works” (Fig. 8) and stated that “the box labeled Decision Makeris the most important part of the thinking machine”.

Lotfi Zadeh moved from Tehran, Iran, to the USA in 1944. Before that, he had received a BS degree in Electrical Engineering in Tehran in 1942. After a while, he continued his studies at MIT, where he received an MS degree in 1946. He then moved to New York, where he joined the faculty of Columbia University as an instructor. In 1949 he wrote his doctoral dissertation on Frequency Analysis of Variable Networks [48] under the supervision of John Ralph Ragazzini, and in 1950 he became an assistant professor at Columbia. The following sections of this article

Fig. 8 Zadeh’s schematic diagram illustrating the arrangement of the basic elements of a thinking machine ([47], p. 13)


Fig. 9 Signal space representation: comparison of the distances between the received signal y and all possible transmitted signals ([49], p. 202)

deal with Zadeh’s contributions to establishing a new field in science that is now considered to be a part of research in artificial intelligence: soft computing, also known as computational intelligence. This development starts with Zadeh’s work in information and communication technology.

Information and Communication Science

In March 1950 Zadeh gave a lecture on Some Basic Problems in Communication of Information at the New York Academy of Sciences [49], in which he represented signals as ordered pairs (x(t), y(t)) of points in a signal space Σ, which is embedded in a function space (Fig. 9). The first problem deals with the recovery process of transmitted signals:

“Let X = {x(t)} be a set of signals. An arbitrarily selected member of this set, say x(t), is transmitted through a noisy channel Γ and is received as y(t). As a result of the noise and distortion introduced by Γ, the received signal y(t) is, in general, quite different from x(t). Nevertheless, under certain conditions it is possible to

Fig. 10 Geometrical representation of nonlinear (left) and linear filtering (right) ([49], p. 202)


recover x(t) – or rather a time-delayed replica of it – from the received signal y(t).” ([49], p. 201)

In this paper, he did not examine the case where {x(t)} is an ensemble; he restricted his view to the problem of recovering x(t) from y(t) “irrespective of the statistical character of {x(t)}” ([49], p. 201). Corresponding to the relation y = Γx between the signals x(t) and y(t), he represented the recovery of x(t) from y(t) by x = Γ−1y, where Γ−1 is the inverse of Γ, if it exists, over {y(t)}.

Zadeh represented signals as ordered pairs of points in a signal space Σ, which is embedded in a function space with a delta-function basis. To measure the disparity between x(t) and y(t), he attached a distance function d(x, y) with the usual properties of a metric. Then he considered the special case in which it is possible to achieve a perfect recovery of the transmitted signal x(t) from the received signal y(t). He supposed that “X = {x(t)} consist of a finite number of discrete signals x1(t), x2(t), . . . , xn(t), which play the roles of symbols or sequences of symbols. The replicas of all these signals are assumed to be available at the receiving end of the system. Suppose that a transmitted signal xk is received as y. To recover the transmitted signal from y, the receiver evaluates the ‘distance’ between y and all possible transmitted signals x1, x2, . . . , xn, with the use of a suitable distance function d(x, y), and then selects that signal which is ‘nearest’ to y in terms of this distance function (Fig. 9). In other words, the transmitted signal is taken to be the one that results in the smallest value of d(x, y). This, in brief, is the basis of the reception process.” ([49], p. 201)

This process recovers the transmitted signal whenever the received signal y(t) is always ‘nearer’ – in terms of the distance function d(x, y) – to the transmitted signal xk than to any other possible signal xi.
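The reception rule Zadeh describes amounts to a minimum-distance (nearest-neighbour) decoder. A sketch with invented sample signals, using the Euclidean metric as one possible choice for the ‘suitable distance function’ d(x, y):

```python
# Illustrative minimum-distance receiver; the signals are invented
# discrete-time sequences, not taken from [49].
import math

def d(x, y):
    # Euclidean distance between two sampled signals
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def receive(y, candidates):
    # pick the replica signal nearest to the received signal y
    return min(candidates, key=lambda x: d(x, y))

x1 = [1.0, 1.0, -1.0]
x2 = [-1.0, 1.0, 1.0]
y = [0.9, 1.2, -0.8]          # x1 corrupted by a little noise

print(receive(y, [x1, x2]) == x1)  # True
```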

Zadeh conceded that “in many practical situations it is inconvenient, or even impossible, to define a quantitative measure, such as a distance function, of the disparity between two signals. In such cases we may use instead the concept of neighborhood, which is basic to the theory of topological spaces.” ([49], p. 202) – About 15 years later, he proposed another ‘concept of neighborhood’, which is now basic to the theory of fuzzy systems.

He also discussed the multiplex transmission of two or more signals: A system has two channels and the sets of signals assigned to their respective channels are X = {x(t)} and Y = {y(t)}. If we are given the sum signal u(t) = x(t) + y(t) at the receiving end, how can we extract x(t) and y(t) from u(t)? – We have to find two filters N1 and N2 such that, for any x in X and any y in Y,

N1(x + y) = x and N2(x + y) = y.

In the signal space representation, two manifolds Mx and My in Σ correspond to the sets of signals X and Y. Zadeh showed that the coordinates of the signal vector x(t) are computable in terms of the coordinates of the signal vector u(t) in the following form:

xν = Hν(u1, u2, . . . , un), ν = 1, 2, . . . , n


Symbolically, he wrote x = H(u), where H (or, equivalently, the components Hν) provides the desired characterization of the ideal filter.

Zadeh argued that all equations in all coordinates of the vectors x, y, and u “can be solved by the use of machine computation or other means” and he stated an analogy between filters and computers: “It is seen that, in general, a filter such as N1 may be regarded essentially as a computer which solves equations [. . . ] of x(t) in terms of those of u(t).” ([49], p. 204)

Finally, in this lecture Zadeh mentioned the case of linearity of the equations for fi and gj. “In this case the manifolds Mx and My are linear, and the operation performed by the ideal filter is essentially that of projecting the signal space Σ on Mx along My” ([49], p. 204). He illustrated both the nonlinear and the linear modes of ideal filtering in Fig. 10 in terms of a two-dimensional signal space.

This analogy between the projection in a function space and the filtering by a filter led Zadeh in the early 1950s to a functional symbolism of filters [50, 51]. Thus, N = N1 + N2 represents a filter consisting of two filters connected by addition, N = N1N2 represents their tandem combination, and N = N1|N2 the separation process (Fig. 11).

But in reality filters do not do exactly what they are supposed to do in theory. Therefore, in later papers (e.g. in [52]) Zadeh started to differentiate between ideal and optimal filters. Ideal filters are defined as filters that achieve a perfect separation of signal and noise, but in practice such ideal filters do not exist. Zadeh regarded optimal filters to be those that give the “best approximation” of a signal, and he noticed that “best approximations” depend on reasonable criteria. At that time he formulated these criteria in statistical terms [52].

In the late 1950s Zadeh realized that there are many problems in applied science (primarily those involving complex, large-scale, or animated systems) in which we cannot compute exact solutions and therefore have to be content with unsharp – fuzzy – solutions. To handle such problems, reasoning with loose concepts turned out to be very successful, and in the 1960s Zadeh developed a mathematical theory to formalize this kind of reasoning, namely the theory of fuzzy sets [11, 12, 13]. During this time Zadeh was in constant touch with his close

Fig. 11 Functional symbolism of ideal filters ([51], p. 225)


friend and colleague Richard E. Bellman (1920–1984) (Fig. 12, left side), a young mathematician working at the RAND Corporation in Santa Monica. This friendship began in the late 1950s, when they met for the first time in New York while Zadeh was at Columbia University, and ended with Bellman’s death. Even though they dealt with different, special problems in mathematical aspects of electrical engineering, system theory, and later computer science, they met each other very often and discussed many subjects involved in their scientific work. Thus, it is not surprising that the history of fuzzy set theory is closely connected with Richard Bellman.

Fuzzy Sets and Systems

Zadeh came up with the idea of fuzzy sets in the summer of 1964. He and Bellman had planned to work together at RAND in Santa Monica. Before that, Zadeh had to give a talk on pattern recognition at a conference at the Wright-Patterson Air Force Base in Dayton, Ohio. During that time he started thinking about the use of grades of membership for pattern classification, and he conceived the first example of fuzzy mathematics, which he wrote in one of his first papers on the subject: “For example, suppose that we are concerned with devising a test for differentiating between the handwritten letters O and D. One approach to this problem would be to give a set of handwritten letters and indicate their grades of membership in the fuzzy sets O and D. On performing abstraction on these samples, one obtains the estimates μ̂O and μ̂D of μO and μD, respectively. Then given a letter x which is not one of the given samples, one can calculate its grades of membership in O and D, and, if O and D have no overlap, classify x in O or D.” ([12], p. 30.)

Over the next few days he extended this idea to a preliminary version of the theory of fuzzy sets, and when he got to Santa Monica, he discussed these ideas with Bellman. Then he wrote two manuscripts: “Abstraction and Pattern Classification” and “Fuzzy Sets”. The authors of the first manuscript were listed as Richard Bellman, his associate Robert Kalaba (1926–2004) (Fig. 12, middle), and Lotfi Zadeh (Fig. 12, right side). It was printed as RAND memorandum RM-4307-PR in October 1964 and was written by Lotfi Zadeh (Fig. 13). In it he defined fuzzy sets for the

Fig. 12 Richard E. Bellman, Robert E. Kalaba, and Lotfi A. Zadeh


first time in a scientific paper, establishing a general framework for the treatment of pattern recognition problems [52]. This was an internal RAND communication, but Zadeh also submitted the paper to the Journal of Mathematical Analysis and Applications, whose editor was none other than Bellman. It was accepted for publication, under the original title, in 1966 [53]. This was the article that Karl Menger cited in his contribution at the Mach symposium in 1966.

The second manuscript that Zadeh wrote on fuzzy sets was called “Fuzzy Sets” [11]. He submitted it to the editors of the journal Information and Control in November 1964 (Fig. 15, left side). As Zadeh himself was a member of the editorial board of this journal, the reviewing process was quite short. “Fuzzy Sets” appeared in June 1965 as the first article on fuzzy sets in a scientific journal.

As was usual in the Department of Electrical Engineering at Berkeley, this text was also preprinted as a report, and this preprint appeared as ERL Report No. 64-44 of the Electronics Research Laboratory in November 1964 [53].

In this seminal article Zadeh introduced new mathematical entities as classes or sets that “are not classes or sets in the usual sense of these terms, since they do not dichotomize all objects into those that belong to the class and those that do not.” He introduced “the concept of a fuzzy set, that is a class in which there may be a

Fig. 13 Zadeh’s RAND memo [52]


Fig. 14 Zadeh’s illustration of fuzzy sets in R1: “The membership function of the union is comprised of curve segments 1 and 2; that of the intersection is comprised of segments 3 and 4 (heavy lines).” ([11], p. 342.)

continuous infinity of grades of membership, with the grade of membership of an object x in a fuzzy set A represented by a number fA(x) in the interval [0, 1].” [11]

The question was how to generalize various concepts: union of sets, intersection of sets, and so forth. Zadeh defined equality, containment, complementation, intersection, and union relating to fuzzy sets A, B in any universe of discourse X as follows (for all x ∈ X) (Fig. 14):

A = B if and only if μA(x) = μB(x),

A ⊆ B if and only if μA(x) ≤ μB(x),

¬A is the complement of A if and only if μ¬A(x) = 1 − μA(x),

A ∪ B if and only if μA∪B(x) = max(μA(x), μB(x)),

A ∩ B if and only if μA∩B(x) = min(μA(x), μB(x)).
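These definitions translate directly into code over a finite universe of discourse. In the following sketch, the universe and the membership values are invented for illustration; the operations themselves are the max/min/complement definitions above:

```python
# Zadeh's 1965 fuzzy set operations over a finite universe, with
# membership functions stored as dictionaries. The universe X and the
# membership values are illustrative, not taken from [11].

X = ["x1", "x2", "x3"]
mu_A = {"x1": 0.2, "x2": 0.7, "x3": 1.0}
mu_B = {"x1": 0.5, "x2": 0.3, "x3": 0.8}

def complement(mu):
    # μ¬A(x) = 1 − μA(x)
    return {x: 1.0 - mu[x] for x in mu}

def union(mu1, mu2):
    # μA∪B(x) = max(μA(x), μB(x))
    return {x: max(mu1[x], mu2[x]) for x in mu1}

def intersection(mu1, mu2):
    # μA∩B(x) = min(μA(x), μB(x))
    return {x: min(mu1[x], mu2[x]) for x in mu1}

def contains(mu1, mu2):
    # A ⊆ B iff μA(x) ≤ μB(x) for all x
    return all(mu1[x] <= mu2[x] for x in mu1)

print(union(mu_A, mu_B))         # {'x1': 0.5, 'x2': 0.7, 'x3': 1.0}
print(intersection(mu_A, mu_B))  # {'x1': 0.2, 'x2': 0.3, 'x3': 0.8}
print(contains(intersection(mu_A, mu_B), mu_A))  # True
```

For crisp sets (membership values 0 or 1), these operations reduce to the ordinary set operations, which is why they are a genuine generalization.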

Later, the operations of minimum and maximum of membership functions could be identified as a specific t-norm and t-conorm, respectively. These algebraic concepts, which Karl Menger introduced into mathematics in connection with statistical metrics in the early 1940s, are now an important tool in modern fuzzy mathematics.

In April 1965 the Symposium on System Theory was held at the Polytechnic Institute in Brooklyn. Zadeh presented “A New View on System Theory”: a view that deals with the concepts of fuzzy sets, “which provide a way of treating fuzziness in a quantitative manner.” He explained that “these concepts relate to situations in which the source of imprecision is not a random variable or a stochastic process but rather a class or classes which do not possess sharply defined boundaries.” [12] His “simple” examples in this brief summary of his new “way of dealing with classes in which there may be intermediate grades of membership” were the “class” of real numbers which are much larger than, say, 10, and the “class” of “bald men”, as well as the “class” of adaptive systems. In the subsequent publication of the proceedings of this symposium, there is a shortened manuscript version of Zadeh’s talk, which


Fig. 15 Zadeh’s first published papers on fuzzy sets in 1965 [11] and [12]

is entitled “Fuzzy Sets and Systems” [12] (Fig. 15, right side). In this lecture and published paper, Zadeh first defined “fuzzy systems” as follows:

A system S is a fuzzy system if the input u(t), the output y(t), or the state s(t) of S, or any combination of them, ranges over fuzzy sets. ([12], p. 33)

We have the same system equations that hold for usual systems, but with different meanings for u, y, and s as specified in this definition:

st+1 = f(st, ut), yt = g(st, ut), t = 0, 1, 2, . . .
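The recursion in these equations can be stepped through directly. The following minimal sketch uses invented crisp, numeric f, g, and inputs just to make the state-update loop concrete; in a genuinely fuzzy system, u, y, and s would range over fuzzy sets rather than numbers:

```python
# Discrete-time system equations s_{t+1} = f(s_t, u_t), y_t = g(s_t, u_t).
# The particular f, g, and input sequence are illustrative placeholders,
# not taken from [12].

def f(s, u):          # state-transition function
    return s + u

def g(s, u):          # output function
    return 2 * s + u

def run(s0, inputs):
    """Iterate the system equations from initial state s0."""
    s, outputs = s0, []
    for u in inputs:
        outputs.append(g(s, u))   # y_t = g(s_t, u_t)
        s = f(s, u)               # s_{t+1} = f(s_t, u_t)
    return s, outputs

final_state, ys = run(0, [1, 2, 3])
print(final_state)  # 6
print(ys)           # [1, 4, 9]
```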

He maintained that these new concepts provide a “convenient way of defining abstraction – a process which plays a basic role in human thinking and communication.” ([12], p. 29)

5 Towards Fuzzy Sets in Artificial Intelligence

The theory of fuzzy sets is a mathematical theory for dealing with vagueness and other loose concepts lacking strict boundaries. It seems that “vagueness”, as it has been used in philosophy and logic in the 20th century, may be formalized by fuzzy sets, whereas “haziness”, like other scientific concepts, e.g. indeterminacy, is a concept that needs to be formalized by probability theory and statistics. Nevertheless, fuzzy mathematics cannot possibly be imagined without the use of t-norms and t-conorms.

The theory of fuzzy sets is considered to be the core of “soft computing”, which became a new dimension of artificial intelligence in the final years of the 20th century. Zadeh was inspired by the “remarkable human capability to perform a wide variety of physical and mental tasks without any measurements and any computations. Everyday examples of such tasks are parking a car, playing golf, deciphering sloppy handwriting and summarizing a story. Underlying this capability is the brain’s crucial ability to reason with perceptions – perceptions of time, distance, speed, force, direction, shape, intent, likelihood, truth and other attributes of physical and mental objects.” ([54], p. 903)


Fig. 16 Perception-based system modelling [57]

In the 1990s he established Computing with Words (CW) [55, 56], instead of exact computing with numbers, as a method for reasoning and computing with perceptions. In an article entitled “Fuzzy Logic = Computing with Words” he stated that “the main contribution of fuzzy logic is a methodology for computing with words. No other methodology serves this purpose” ([55], p. 103.). Three years later he wrote “From Computing with Numbers to Computing with Words – From Manipulation of Measurements to Manipulation of Perceptions”, to show that a new “computational theory of perceptions, or CTP for short, is based on the methodology of CW. In CTP, words play the role of labels of perceptions and, more generally, perceptions are expressed as propositions in natural language.” ([56], p. 105)

Zadeh had observed “that progress has been, and continues to be, slow in those areas where a methodology is needed in which the objects of computation are perceptions – perceptions of time, distance, form, direction, color, shape, truth, likelihood, intent, and other attributes of physical and mental objects.” ([57], p. 73.) He set out his ideas in the AI Magazine in the spring of 2001. In this article, called “A New Direction in AI”, he presented perception-based system modelling (Fig. 16): “A system, S, is assumed to be associated with temporal sequences of input X1, X2, . . . ; output Y1, Y2, . . . ; and states S1, S2, . . . S is defined by a state-transition function f with St+1 = f(St, Xt) and an output function g with Yt = g(St, Xt), for t = 0, 1, 2, . . . . In perception-based system modelling, inputs, outputs and states are assumed to be perceptions, as are the state-transition function, f, and the output function, g.” ([57], p. 77.)

In spite of its advances, artificial intelligence cannot compete with human intelligence on many levels and will not be able to do so in the very near future. Nonetheless, CTP, CW, and fuzzy sets enable computers and human beings to communicate in terms that allow them to express uncertainty regarding measurements, evaluations, (medical) diagnostics, etc. In theory, this should put the methods of information and communication science used by machines and human beings on levels that are much closer to each other.

Acknowledgments The author would like to thank Professor Dr. Lotfi A. Zadeh (Berkeley, California) for his generous help and unstinted willingness to support the author's project on the history of the theory of fuzzy sets and systems, and also Professor Dr. Erich-Peter Klement (Linz, Austria) for his generous help and advice on Karl Menger's role in the history of fuzzy sets. Many thanks are due to Jeremy Bradley for his advice and help in the preparation of this contribution.

Page 89: Forging New Frontiers: Fuzzy Pioneers I

Pioneers of Vagueness, Haziness, and Fuzziness in the 20th Century 79

References

1. R. Seising, "Fuzziness before Fuzzy Sets: Two 20th Century Philosophical Approaches to Vagueness – Ludwik Fleck and Karl Menger". Proceedings of the IFSA 2005 World Congress, International Fuzzy Systems Association, July 28–31, 2005, Beijing, China, CD.

2. R. Seising, "Vagueness, Haziness, and Fuzziness in Logic, Science, and Medicine – Before and When Fuzzy Logic Began". BISCSE'05 Forging New Frontiers. 40th of Fuzzy Pioneers (1965–2005), BISC Special Event in Honour of Prof. Lotfi A. Zadeh, Memorandum No. UCB/ERL M05/31, November 2, 2005, Berkeley Initiative in Soft Computing (BISC), Computer Science Division, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California, USA.

3. R. Seising, The 40th Anniversary of Fuzzy Sets – New Views on System Theory. International Journal of General Systems, Special Issue "Soft Computing for Real World Applications", 2006 (in review).

4. R. Seising, Optimality, Noninferiority, and the Separation Problem in Pattern Classification: Milestones in the History of the Theory of Fuzzy Sets and Systems – The 40th Anniversary of Fuzzy Sets. IEEE Transactions on Fuzzy Systems, 2006 (in preparation).

5. R. Seising, From Filters and Systems to Fuzzy Sets and Systems – The 40th Anniversary of Fuzzy Sets. Fuzzy Sets and Systems, 2006 (in preparation).

6. R. Seising, Roots of the Theory of Fuzzy Sets in Information Sciences – The 40th Anniversary of Fuzzy Sets. Information Sciences, 2006 (in preparation).

7. R. Seising, From Vagueness in the Medical Way of Thinking to Fuzzy Reasoning Foundations of Medical Diagnoses. Artificial Intelligence in Medicine, 38, 2006, 237–256.

8. T. Kotarbiński, Elementy teorii poznania, logiki formalnej i metodologii nauk, Ossolineum, Lwów, 1929.

9. K. Ajdukiewicz, On the problem of universals, Przeglad Filozoficzny, 38, 1935, 219–234.

10. L. A. Zadeh, From Circuit Theory to System Theory. Proceedings of the IRE, 50, 1962, 856–865.

11. L. A. Zadeh, Fuzzy Sets, Information and Control, 8, 1965, 338–353.

12. L. A. Zadeh, Fuzzy Sets and Systems. In: Fox J (ed), System Theory. Microwave Research Institute Symposia Series XV, Polytechnic Press, Brooklyn, New York, 1965, 29–37.

13. L. A. Zadeh, R. E. Bellman, R. E. Kalaba, Abstraction and Pattern Classification, Journal of Mathematical Analysis and Applications, 13, 1966, 1–7.

14. R. Seising, "Noninferiority, Adaptivity, Fuzziness in Pattern Separation: Remarks on the Genesis of Fuzzy Sets". Proceedings of the Annual Meeting of the North American Fuzzy Information Processing Society (NAFIPS 2004): Fuzzy Sets in the Heart of the Canadian Rockies, June 27–30, 2004, Banff, Alberta, Canada, 2002–2007.

15. R. Seising, "40 years ago: Fuzzy Sets is going to be published", Proceedings of the 2004 IEEE International Joint Conference on Neural Networks / International Conference on Fuzzy Systems (FUZZ-IEEE 2004), July 25–29, 2004, Budapest, Hungary, CD.

16. R. Seising, "1965 – "Fuzzy Sets" appear – A Contribution to the 40th Anniversary". Proceedings of the Conference FUZZ-IEEE 2005, Reno, Nevada, May 22–25, 2005, CD.

17. R. Seising, "The 40th Anniversary of Fuzzy Sets – A New View on System Theory". In: Ying H, Filev D (eds), Proceedings of the NAFIPS Annual Conference, Soft Computing for Real World Applications, June 22–25, 2005, Ann Arbor, Michigan, USA, 92–97.

18. R. Seising, "On the fuzzy way from "Thinking Machines" to "Machine IQ"", Proceedings of the IEEE International Workshop on Soft Computing Applications (IEEE – SOFA 2005), August 27–30, 2005, Szeged, Hungary and Arad, Romania, 251–256.

19. R. Seising, Die Fuzzifizierung der Systeme. Die Entstehung der Fuzzy Set Theorie und ihrer ersten Anwendungen – Ihre Entwicklung bis in die 70er Jahre des 20. Jahrhunderts, Franz Steiner Verlag, Stuttgart, 2005.

20. G. Frege, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens, Halle, 1879.

21. G. Frege, Funktion und Begriff, 1891. In: Patzig G (ed), Frege G (1986) Funktion, Begriff, Bedeutung, Vandenhoeck & Ruprecht, Göttingen, 18–39.

22. G. Frege, Grundgesetze der Arithmetik, 2 vols., Hermann Pohle, Jena, 1893–1903.

23. B. Russell, Vagueness, The Australasian Journal of Psychology and Philosophy, 1, 1923, 84–92.

24. M. Black, Vagueness. An exercise in logical analysis, Philosophy of Science, 4, 1937, 427–455.

25. M. Black, Reasoning with loose concepts, Dialogue, 2, 1963, 1–12.

26. L. Fleck, O niektórych swoistych cechach myślenia lekarskiego, Archiwum Historji i Filozofji Medycyny oraz Historji Nauk Przyrodniczych, 6, 1927, 55–64. Engl. transl.: Ludwik Fleck: Some specific features of the medical way of thinking. In: [30], 39–46.

27. L. Fleck, Entstehung und Entwicklung einer wissenschaftlichen Tatsache. Einführung in die Lehre vom Denkstil und Denkkollektiv, Schwabe & Co., Basel, 1935.

28. T. J. Trenn, R. K. Merton (eds), Genesis and Development of a Scientific Fact, University of Chicago Press, Chicago/London, 1979.

29. T. S. Kuhn, The Structure of Scientific Revolutions, University of Chicago Press, Chicago, 1962.

30. R. S. Cohen, T. Schnelle (eds), Cognition and Fact. Materials on Ludwik Fleck, D. Reidel Publ. Comp., Dordrecht, Boston, Lancaster, Tokyo, 1986.

31. L. A. Zadeh, Biological Applications of the Theory of Fuzzy Sets and Systems. In: The Proceedings of an International Symposium on Biocybernetics of the Central Nervous System, Little, Brown and Company, Boston, 1969, 199–206.

32. R. Seising, From Vagueness in the Medical Way of Thinking to Fuzzy Reasoning Foundations of Medical Diagnosis. Artificial Intelligence in Medicine, Special Issue: Fuzzy Sets in Medicine (85th birthday of Lotfi Zadeh), October 2006 (in review).

33. K. Menger, Memories of Moritz Schlick. In: E. T. Gadol (ed), Rationality and Science. A Memorial Volume for Moritz Schlick in Celebration of the Centennial of his Birth, Springer, Vienna, 1982, 83–103.

34. K. Menger, The New Logic: A 1932 Lecture. In: Selected Papers in Logic and Foundations, Didactics, Economics, Vienna Circle Collection, Volume 10, D. Reidel, Dordrecht, 1979.

35. R. Carnap, Logische Syntax der Sprache, Springer, Wien, 1934. (English edition: R. Carnap, Logical Syntax of Language, Humanities, New York.)

36. K. Menger, Statistical Metrics, Proceedings of the National Academy of Sciences, 28, 1942, 535–537.

37. K. Menger, Probabilistic Geometry, Proceedings of the National Academy of Sciences, 37, 1951, 226–229.

38. K. Menger, Probabilistic Theories of Relations, Proceedings of the National Academy of Sciences, 37, 1951, 178–180.

39. K. Menger, Ensembles flous et fonctions aléatoires, Comptes Rendus de l'Académie des Sciences, 232, 1951, 2001–2003.

40. K. Menger, Geometry and Positivism. A Probabilistic Microgeometry. In: K. Menger (ed.), Selected Papers in Logic and Foundations, Didactics, Economics, Vienna Circle Collection, 10, D. Reidel Publ. Comp., Dordrecht, Holland, 1979, 225–234.

41. E. Mach, Die Principien der Wärmelehre. Historisch-kritisch entwickelt, Verlag von Johann Ambrosius Barth, Leipzig, 1896.

42. E. Schrödinger, Grundlinien einer Theorie der Farbenmetrik im Tagessehen, Annalen der Physik, 63, 4, 1920, 397–456, 481–520.

43. E. C. Berkeley, Giant Brains or Machines that Think, John Wiley & Sons, Chapman & Hall, New York, London, 1949.

44. A. M. Turing, Computing Machinery and Intelligence, Mind, LIX, 236, 1950, 433–460.

45. N. Wiener, Cybernetics or Control and Communication in the Animal and the Machine, Hermann & Cie., The Technology Press, and John Wiley & Sons, Cambridge, Mass., New York, 1948.

46. C. E. Shannon, A Mathematical Theory of Communication, Bell System Technical Journal, 27, 1948, 379–423 and 623–656.

47. L. A. Zadeh, Thinking Machines – A New Field in Electrical Engineering, Columbia Engineering Quarterly, Jan. 1950, 12–13, 30–31.

48. L. A. Zadeh, Frequency Analysis of Variable Networks, Proceedings of the IRE, 38, March 1950, 291–299.

49. L. A. Zadeh, Some Basic Problems in Communication of Information. In: The New York Academy of Sciences, Series II, 14, 5, 1952, 201–204.

50. L. A. Zadeh, K. S. Miller, Generalized Ideal Filters, Journal of Applied Physics, 23, 2, 1952, 223–228.

51. L. A. Zadeh, Theory of Filtering, Journal of the Society for Industrial and Applied Mathematics, 1, 1953, 35–51.

52. R. E. Bellman, R. E. Kalaba, L. A. Zadeh, Abstraction and Pattern Classification, Memorandum RM-4307-PR, The RAND Corporation, Santa Monica, California, October 1964.

53. L. A. Zadeh, Fuzzy Sets, ERL Report No. 64-44, University of California at Berkeley, November 16, 1964.

54. L. A. Zadeh, The Birth and Evolution of Fuzzy Logic – A Personal Perspective, Journal of Japan Society for Fuzzy Theory and Systems, 11, 6, 1999, 891–905.

55. L. A. Zadeh, Fuzzy Logic = Computing with Words, IEEE Transactions on Fuzzy Systems, 4, 2, 1996, 103–111.

56. L. A. Zadeh, From Computing with Numbers to Computing with Words – From Manipulation of Measurements to Manipulation of Perceptions, IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 45, 1, 1999, 105–119.

57. L. A. Zadeh, A New Direction in AI. Toward a Computational Theory of Perceptions. AI Magazine, 22, 1, 2001, 73–84.


Selected Results of the Global Surveyon Research, Instruction and DevelopmentWork with Fuzzy Systems

Vesa A. Niskanen

Abstract Selected results of the survey on research, instruction and development work with fuzzy systems are presented. This study was carried out at the request of the International Fuzzy Systems Association (IFSA). Electronic questionnaire forms were sent to the IFSA Societies, relevant mailing lists and central persons in fuzzy systems, and 166 persons from 36 countries filled in our form. Since our data set was relatively small, we could only draw some general guidelines from it, and their statistical analysis is presented below. Nevertheless, studies of this type can, first, support future research design and work on fuzzy systems by revealing the relevant research areas and activities globally. Second, we can support future comprehensive educational planning on fuzzy systems globally. Third, we can better promote the marketing of fuzzy systems to decision makers, researchers, students and customers. Fourth, we can support the future global networking of persons, institutes and firms in the area of fuzzy systems.

1 Background

Since Lotfi Zadeh's invention of fuzzy systems in 1965, numerous books and articles have been published and a lot of applications have been produced in this field. In addition, numerous university-level courses on fuzzy systems have been held.

However, we still lack exact numbers concerning publications, applications, researchers and fuzzy courses, even though they are relevant for planning and decision making in both business life and non-corporate public services. In addition, if these numbers are sufficiently convincing, they can well promote further global dissemination of fuzzy systems in the future.

For example, it seems that there are already over 40 000 publications and thousands of patents on fuzzy systems, as well as thousands of researchers globally.

Vesa A. Niskanen
Department of Economics & Management, University of Helsinki, PO Box 27, 00014 Helsinki, Finland
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


Hence, we need global surveys on these activities in order to achieve the foregoing aims.

The author has already designed and performed a global survey of this type at the request of the International Fuzzy Systems Association (IFSA) in 2003–2005, and some results were presented at the IFSA World Congress in Beijing in 2005. However, since only 166 persons responded to our questionnaire, we are unable to draw any general conclusions from this data set.

Hence, below we mainly present certain descriptive results, and we also draw up some methodological guidelines for performing a global survey on fuzzy activities in research, instruction and development work. We expect that the considerations below can provide a basis for further studies on this important subject matter. Section 2 provides the design and basic characteristics of our survey. Section 3 presents the results. Section 4 provides a summary of our research.

2 Basic Characteristics of a Survey

A survey usually aims to describe systematically and in an exact manner a phenomenon, a group of objects or a subject matter at a particular point in time. Surveys can also determine the relationships that exist between specific events [1].

In the survey research design we establish our aims and possible hypotheses. We also acquaint ourselves with previous similar studies and the necessary analysis methods, as well as make plans for the data collection. Practical matters include tasks such as scheduling and financing.

In our case we can establish these aims:

1. A world-wide causal-comparative survey and examination of fuzzy systems in scientific research, university-level instruction and commercial products.

2. Establishment of aims and guidelines for future studies and instruction on fuzzy systems around the globe.

The data collection is based on structured questionnaire forms, and we can use them in interviews or in postal questionnaires (today electronic forms and the Internet seem to be replacing traditional postal forms). In addition, we can collect data with standardized tests. A census study seemed more usable in this context than a sample survey, although the size of the population was obviously large. This approach was due to the fact that the population structure and size were unknown, and thus we were unable to design a reasonable sampling. The author designed an electronic structured questionnaire form, and this was distributed to the IFSA Societies, the BISC mailing list and to some key persons in 2004–2005 [4].

Surveys usually proceed through well-defined stages, and we thus proceed from our problem-setting to conclusions and a research report. If we assume four principal stages in our study, the research design is carried out first. Second, data collection is performed. Third, our data is examined. Fourth, conclusions are provided and a report is published.


If we focus on the nature of our variables and consider the frequencies of our observations, we carry out a descriptive survey. If we also perform comparisons between the classes in our data or consider interrelationships between the variables, we perform a comparative survey. Finally, if we aim to explain or interpret the interrelationships between the variables by mainly focusing on their causal connections, we are dealing with a causal-comparative survey. We mainly adopted the third approach.

Our data was examined with statistical methods [2, 5]. In this context various methods and tools, such as chi-square analysis, correlation analysis, t-tests, analysis of variance and regression analysis, can be used. Today we can also replace some of these methods with Soft Computing modeling [3]. Below we only sketch some essential results, which are mainly based on descriptive statistics, because our relatively small and heterogeneous data set prevented us from performing deeper and more conclusive analyses. In addition, lack of space also imposes restrictions in this context.
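
As an illustration of the first of these tools, a two-sample comparison of group means can be computed with the standard pooled-variance t statistic. A stdlib-only sketch follows; the age lists are invented for illustration and are not the survey data:

```python
import math

def t_statistic(a, b):
    """Pooled two-sample t statistic for the difference of means."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Unbiased sample variances of each group
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    # Pooled variance and standard error of the mean difference
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (ma - mb) / se

ages_female = [38, 41, 45, 50, 44]  # hypothetical data
ages_male = [40, 43, 46, 47, 42]    # hypothetical data
print(round(t_statistic(ages_female, ages_male), 3))  # prints 0.0
```

A statistical package such as the SPSS software cited in [5] computes the same statistic together with its p-value.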

3 Selected Results

Most of our 166 respondents were males (Table 1). The respondents represented 36 countries according to their present citizenship, and they had usually stayed permanently in their native countries, although about 60% had also worked abroad for brief periods of time. Their ages ranged from 24 to 70 years, and the mean was about 44 years (Fig. 1). There were no significant differences in the mean ages of the males and females when the t-test was used.

Table 1 Respondents according to their gender

Gender   Frequency   Percent
Female    34          20.5
Male     132          79.5
Total    166         100.0

The average age at which the respondents became acquainted with fuzzy systems, i.e., their "fuzzy ages", was about 29 years (ranging from 16 to 55 years; Fig. 2). According to the t-test, there were no significant differences in the means of these ages between the males and females. As Fig. 3 also shows, there was a significant positive correlation between the respondents' ages and their "fuzzy ages".
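
The correlation in question is the usual Pearson coefficient. A stdlib-only sketch, again with invented (not survey) data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

age = [30, 40, 50, 60]        # hypothetical respondent ages
fuzzy_age = [22, 27, 33, 38]  # hypothetical "fuzzy ages"
print(round(pearson_r(age, fuzzy_age), 3))  # close to +1: strong positive correlation
```

A value near +1, as here, corresponds to the kind of pattern visible in Fig. 3.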

About 70% were scholars, such as Professors and Lecturers, and about 84% had Doctoral degrees (Figs. 4 and 5). According to the one-way analysis of variance (ANOVA) and Kruskal-Wallis non-parametric analysis, there were significant differences in the means of the ages between the titles of the respondents (Fig. 6), but these differences were not significant when the corresponding "fuzzy ages" were considered. Two-way ANOVA models with the additional factor Gender were also used, but this variable seemed to be insignificant in this context. The two-way


Fig. 1 Ages of respondents (histogram: Age vs. Frequency)

Fig. 2 Ages in which respondents first became acquainted with fuzzy systems ("fuzzy ages") (histogram: Age vs. Frequency)

ANOVA model also showed that there were no significant differences in the means of fuzzy ages when Gender and Degree were our factors.

As regards the respondents' function, most of them were involved in R&D, Education or in several areas simultaneously (Fig. 7). Significant differences in the


Fig. 3 Ages vs. "fuzzy ages" (scatter plot by Gender: Female, Male)

Fig. 4 Titles of respondents (categories: Professor; Dr., Lecturer; Several; Other)


Fig. 5 Respondents' degrees (categories: Bachelor; Master; Dr.; Other)

Fig. 6 The means of ages according to the titles

means of ages were found (Fig. 8), whereas no significant differences were found in the case of their "fuzzy ages" (standard ANOVA and Kruskal-Wallis modeling).

Only about 55% were members of the IFSA Societies, and most of the members belonged to EUSFLAT (about 25%; Fig. 9). Significant differences in the means of


Fig. 7 Respondents' functions (categories: R&D; Eng/Computer sci; Education; Several; Other)

Fig. 8 Ages according to the functions

the ages were found in this context (ANOVA, Fig. 10), whereas there were no significant differences in the means of the "fuzzy ages". According to Fisher's exact test, there was no correlation between IFSA membership and the respondent's degree.


Fig. 9 Membership in the IFSA Societies (categories: No member; EUSFLAT; SBA; Several; Other)

Fig. 10 Ages according to the IFSA membership


Fig. 11 Research areas (categories: Philosophy & Math; Control; Pattern recognition; Decision making; HW & SW eng; Other; Several)

When the respondents' research areas were examined, about 63% of them were working in several fields. The most popular single areas were decision making as well as philosophy (including logic) and mathematics (Fig. 11). No significant differences were found in the means of the ages or "fuzzy ages" when Gender and Research area were the factors in the two-way ANOVA models.

About 45% of the respondents had completed a doctoral thesis on fuzzy systems within the years 1975–2005 (Fig. 12). There were no significant differences in the means of these years between the males and females. There was a negative correlation between the ages and the years when the theses were completed (Fig. 13). About 7% of the respondents had never attended any fuzzy conference (Fig. 14).

About 13% had published at least one scientific book on fuzzy systems (Fig. 15), about 39% at least one journal article (Fig. 16) and about 40% at least one conference paper (Fig. 17). No significant differences in the mean numbers of publications between the males and the females were found when the t-tests were used.

About 62% of the respondents had provided instruction in fuzzy systems; 41% of them had provided this in robotics, 31% in philosophy and mathematics, 26% in control and 19% in pattern recognition. The first courses were held in 1971 (Fig. 18). For instruction on fuzziness, about 9% had produced textbooks (1 to 4 items), 17% digital material, 14% computer software and 2% other materials. There was a significant negative correlation between their ages and the year when they first held classes on fuzzy systems (Fig. 19).


Fig. 12 Doctoral theses on fuzzy systems (histogram: year of thesis vs. Frequency)

Fig. 13 Age vs. the year when fuzzy thesis completed (scatter plot by Gender: Female, Male)


Fig. 14 Respondents' number of fuzzy conferences per year (histogram: never, <1, 1–6, >6 vs. Frequency)

Fig. 15 Number of published scientific books per person (histogram: Books vs. Frequency)


Fig. 16 Number of published articles per person (histogram: Articles vs. Frequency)

Fig. 17 Number of published conference papers per person (histogram: Confpapers vs. Frequency)


Fig. 18 When the respondents' first courses on fuzzy systems were held (histogram: Teachyear vs. Frequency)

Fig. 19 Age vs. the year when the first fuzzy courses were held (scatter plot: Age vs. Teachyear)


About 58% of the respondents had made fuzzy applications, and these were made between the years 1971 and 2005 (Fig. 20). They included such areas as heavy industry (about 7% of them), vehicle applications (4%), spacecraft applications (1%), consumer products (2%), financing and economics (8%), robotics (7%), computer software and hardware (10%), health care (6%) and military applications (2%). There was a significant negative correlation between the respondents' ages and the year when they first made their applications (Fig. 21). There were no significant differences in the means between the males and females when the numbers of applications were considered. The same held for the differences in the means of the years when fuzzy applications were first made.

4 Summary and Discussion

We have considered the data set collected within the global survey on fuzzy research, instruction and development work in 2004–2005. We used an electronic questionnaire form in a census study, and 166 responses were obtained. Since our data set was quite small, we were unable to draw any generalized conclusions concerning the entire fuzzy community; rather, we can only sketch some tentative results and provide some guidelines for future studies in this area. Lack of space also restricted our examinations in this context. In addition, due to the small data set, we performed both parametric and non-parametric statistical analyses.

The foregoing results nevertheless show that we already have a global fuzzy community with a diversity of professions, and for decades this community has been very active in research, instruction and development work. It also seems that younger generations have been recruited well in these areas. However, we should still have many more females in this work, as well as more persons from the developing countries.

Our research aims would also allow us to perform such qualitative research as semi-structured interviews and action research in order to examine the structures and underlying connections of the human and institutional networks that can be found behind the fuzzy activities. In this context, as well as in our quantitative research, we could also apply the theory of networks, graph theory, concept maps, cognitive maps and neuro-fuzzy modeling. For example, we could use neuro-fuzzy models in regression and cluster analysis. In addition, we could examine simulations of emerging networks and their information flows by applying fuzzy cognitive maps. Hopefully, we can perform these studies in further surveys.

In future surveys we should collect more representative data sets, because this type of work is very important to the fuzzy community. In practice this means hundreds or even thousands of cases. If a sufficient number of cases can be collected, we can support future research design and work on fuzzy systems by revealing the relevant research areas and activities globally. Second, we can enhance the visibility of the fuzzy community. Third, we can support future comprehensive educational planning on fuzzy systems globally. Fourth, we can better promote the marketing of fuzzy systems to decision makers, researchers, students and customers.


Fig. 20 Number of fuzzy applications (histogram: Year_application vs. Frequency)

Fig. 21 Age vs. the first year when fuzzy applications were made (scatter plot: Age vs. Year_application)


Fifth, we can support the future global networking of persons, institutes and firms in the area of fuzzy systems. In brief, we can thus convincingly disseminate the fact that the future is fuzzy.

Acknowledgments I express my gratitude to Professor Lotfi Zadeh for giving me the idea of performing this survey. I also express my thanks to IFSA for supporting this work.

References

1. Cohen L and Manion L (1989) Research Methods in Education. Routledge, London.

2. Guilford J and Fruchter B (1978) Fundamental Statistics in Psychology and Education. McGraw-Hill, London.

3. Niskanen V A (2003) Soft Computing Methods in Human Sciences. Springer Verlag, Heidelberg.

4. Niskanen V A (2004) Questionnaire form, https://kampela.it.helsinki.fi/elomake/lomakkeet/573/lomake.html

5. SPSS™ Statistical Software, Version 12.0.1.


Appendices

Appendix 1. Respondents’ present citizenship.

Country Frequency Percent

Argentina        3     1.8
Australia        4     2.4
Austria          1     0.6
Belgium         10     6.0
Brazil          16     9.6
Bulgaria         3     1.8
Canada           4     2.4
China            3     1.8
Colombia         3     1.8
Czech Republic   8     4.8
Finland          5     3.0
France           2     1.2
Germany          6     3.6
Greece           2     1.2
Hungary          2     1.2
India            8     4.8
Iran             2     1.2
Ireland          1     0.6
Israel           1     0.6
Italy           10     6.0
Japan            1     0.6
Latvia           1     0.6
Netherlands      1     0.6
Nigeria          2     1.2
Poland           2     1.2
Portugal         1     0.6
Romania          4     2.4
Russia           3     1.8
Slovakia         3     1.8
Slovenia         1     0.6
South Africa     1     0.6
Spain           34    20.5
Taiwan           2     1.2
UK               5     3.0
Ukraine          2     1.2
USA              9     5.4
Total          166   100.0


Appendix 2. The electronic questionnaire form

World-Wide Survey on Fuzzy Systems

Dear Recipient of This Questionnaire Form,

Since information on world-wide statistics and research networks of fuzzy systems activities is continuously required in various instances of decision making, by the initiative of Prof. Lotfi Zadeh the International Fuzzy Systems Association (IFSA) has nominated Docent, Dr. Vesa A. Niskanen of the University of Helsinki, Finland, to carry out this task as the Chair of an International Committee of IFSA, referred to as the IFSA Information Committee (website: http://www.honeybee.helsinki.fi/users/niskanen/ifsainfocom.htm), in 2004–2005.

In practice the Information Committee collects worldwide data on research groups, applications, patents, types of products applying fuzzy logic and university-level instruction arrangements. According to the results, aims and guidelines for future global planning, decision making, research and instruction on fuzzy systems can be suggested. This survey also gives you a good opportunity to provide information on your activities and promote the visibility of your work worldwide.

The information you provide is also very valuable for the future policies of IFSA, and it will be examined anonymously and strictly confidentially.

We hope that you can spend a little of your time answering the questions provided below.
In case of problems or questions concerning the filling in of this form, please feel free to contact Dr. Niskanen.
An example of a filled form is at

http://www.honeybee.helsinki.fi/users/niskanen/surveyexample.pdf

******************************
PLEASE REPLY PRIOR TO MARCH 20, 2005.
******************************

Sincerely,

Lotfi Zadeh
IFSA Honorary President

Zenn Bien
IFSA President

Dr. Vesa A. Niskanen
IFSA Information Committee Chair


University of Helsinki, Dept. of Economics & Management
PO Box 27, 00014 Helsinki, Finland
Tel. +358 9 40 5032030, Fax +358 9 191 58096
E-mail: [email protected]


Part 1. Background information about you

1. Your gender:

Female

Male

2. Your country of citizenship when you were born:

3. Your present country of citizenship:

4. Your year of birth (e.g. 1965):

5. In what year did you first become acquainted with fuzzy systems (for example in your work, studies or research; e.g. 1989)?

6. Countries in which you have principally studied or worked with fuzzy systems (max. 5):

Country #1

Country #2

Country #3

Country #4

Country #5


Part 2. Your present working environment

7. What is your title?

President / Chair of the board / CEO

General manager / Manager

Owner

Member of technical staff

Engineer

Computer scientist

Professor / Associate prof. / Assistant prof.

Docent / Dr. / Reader / Lecturer / Instructor / Research scientist

Consultant

Retired

Other (please specify -->)

8. What is your highest university degree?

Bachelor

Master

Doctor

Other (please specify -->)

9. What is your principal function?

Management

Research / Development


Engineering / Computer science

Marketing / Sales / Purchasing

Education / Teaching

Consulting

Retired

Other (please specify -->)

10. Are you a member of any IFSA Society? (cf. http://www.pa.info.mie-u.ac.jp/~furu/ifsa/)

No, I am not a member

SOFT

EUSFLAT

NAFIPS

CFSAT

KFIS

SBA

VFSS

SIGEF

FMSAC

HFA

FIRST

NSAIS

SCI

HAFSA


Part 3. Your activities in fuzzy systems research at the scientific level (skip Part 3 if you do not have any)

11. What are your principal research areas on fuzzy systems (tick all the items that fit your case)?

Philosophy (including logic) or mathematics

Control

Pattern recognition

Decision making

Robotics

Hardware or software engineering

Other (please specify -->)

12. If you have written a doctoral thesis on fuzzy systems, in what year did you complete it (the first one if you have many, e.g. 1989; leave blank if none):

13. How many scientific publications have you written on fuzzy systems? (Put numbers in the boxes, e.g. 3; put 0 if none)

You are the only author

With co-author(s)

Books in English

Articles in edited books or refereed international journals in English

Papers in international conference proceedings in English


14. How often have you attended international conferences, seminars or workshops on fuzzy systems (average per year)?

Never

Less than once a year

Once per year

Twice per year

Three times per year

Four times per year

Five times per year

Six times per year

More than six times per year

Books in other languages

Articles in edited books or refereed international journals in other languages

Papers in conference proceedings in other languages


Part 4. Your activities at university-level fuzzy systems teaching or education (skip Part 4 if you do not have any)

15. In what year did you start teaching (i.e., holding classes) on fuzzy systems (e.g. 1993; leave blank if none):

16. In what year did you start tutoring or supervising undergraduate or postgraduate students on fuzzy systems (e.g. 1993; leave blank if none):

17. What are your principal areas in the education or teaching of fuzzy systems (tick all the items that fit your case)?

Philosophy (including logic) or mathematics

Control

Pattern recognition

Decision making

Robotics

Hardware or software engineering

Other (please specify -->)

18. How many items of instructional material have you published or produced on fuzzy systems? (Put numbers in the boxes, e.g. 3; put 0 if none)

You are the only publisher or producer

With someone else

Textbooks in English


Textbooks in other languages

Computer software in English

Computer software in other languages

Digital material in English (www pages, slide presentations, videos, DVDs, etc.)

Digital material in other languages

Other material


Part 5. Your activities in fuzzy systems applications (skip Part 5 if you do not have any)

19. In what year did you start doing applications with fuzzy systems (e.g. 1997):

20. How many fuzzy systems applications have you done? (Put numbers in the boxes, e.g. 3; put 0 if none)

For commercial purposes when you are the only designer

For commercial purposes as a team member

For other purposes (scientific, educational, etc.) when you are the only designer

For other purposes (scientific, educational, etc.) as a team member

To heavy industry

To automobiles, trains, airplanes or other vehicles

To spacecraft

To home appliances (washing machines, etc.), cameras, phones

To finance or economics

To robotics

To computer software or hardware


21. In which countries are your applications principally used (mention max. 5 countries)?

Country #1:

Country #2:

Country #3:

Country #4:

Country #5:

To health care instruments or services

To military purposes


Part 6. Additional information

22. If you have any comments or additional information on the foregoing questions, please include them here. You can also put your website address here if you would like it to be mentioned in the survey report.

Thank you very much for your contribution.


Fuzzy Models and Interpolation

László T. Kóczy, János Botzheim and Tamás D. Gedeon

Abstract This paper focuses on two essential topics of the fuzzy area. The first is the reduction of fuzzy rule bases. The classical inference methods of fuzzy systems deal with dense rule bases where the universe of discourse is fully covered. By applying sparse or hierarchical rule bases, the computational complexity can be decreased. The second subject of the paper is the introduction of some fuzzy rule base identification techniques. In some cases the fuzzy rule base might be given by a human expert, but usually only numerical patterns are available and an automatic method has to be applied to determine the fuzzy rules.

1 Introduction

An overview of fuzzy model identification techniques and fuzzy rule interpolation is presented in this paper. In Sect. 2 the historical background of fuzzy systems is described. Section 3 introduces the fuzzy rule base reduction methods, including fuzzy rule interpolation, hierarchical fuzzy rule bases, and the combination of these two methods. The next part of the paper deals with fuzzy model identification. In Sect. 4 clustering-based fuzzy rule extraction is discussed, and in Sect. 5 our most recently proposed technique, the bacterial memetic algorithm, is introduced.

László T. Kóczy
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary
Institute of Information Technology and Electrical Engineering, Széchenyi István University, Győr, Hungary

János Botzheim
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Hungary
Department of Computer Science, The Australian National University

Tamás D. GedeonDepartment of Computer Science, The Australian National University

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


2 Background

Fuzzy rule based models were first proposed by Zadeh (1973) and later practically implemented, with some technical innovations, by Mamdani and Assilian (1975). If . . . then . . . rules contain an arbitrary number of variables xi with antecedent descriptors, formulated either by linguistic terms or directly by convex and normal fuzzy sets (fuzzy numbers) in the If . . . part, and one or several output variables yi and their corresponding consequent terms/membership functions in the then . . . part. An alternative model was proposed by Takagi and Sugeno (1985), where the consequent part was represented by functions of the type y = f(x). Later Sugeno and his team proposed the basic idea of hierarchical fuzzy models, where the meta-level contained rules with symbolic outputs referring to further sub-rule bases on the lower levels. All these types of fuzzy rule based models might be interpreted as sophisticated and human-friendly ways of defining (approximate) mappings from the input space to the output space, the rule bases being technically fuzzy relations on X × Y but representing virtual "fuzzy graphs" of extended (vague) functions in the state space.

A great advantage of all these models, compared to more traditional symbolic rule models, is the inherent interpolation property of the fuzzy sets providing a cover of the input space: the kernel values of the antecedents correspond to the areas (or points) where supposedly exact information is available, i.e. typical points of the graph, while vaguer "grey" areas bridge the gaps between these characteristic values by the partially overlapping membership functions or linguistic terms. It was, however, necessary in all initial types of fuzzy models that the antecedents formed a mathematically or semantically full cover, without any gaps between the neighboring terms. The "yellow tomato problem" proposed by Kóczy and Hirota pointed out that often it was not really necessary to have a fuzzy cover, since "yellow tomatoes" could be interpreted as something between "red tomatoes" and "green tomatoes", with properties lying also in between, the former category describing "ripe tomatoes", the latter "unripe tomatoes", yellow tomatoes being thus "half ripe". Linear fuzzy rule interpolation, proposed in 1990, provided a new algorithm extending the methods of reasoning and control to sparse fuzzy covers, where gaps between adjoining terms could be bridged with the help of this new approach, delivering the correct answer both in the case of linguistic rules like the "case of the yellow tomato" and in more formal situations where membership functions were defined only by mathematical functions. This idea was later extended to many non-linear interpolation methods, some of them providing very practical means to deal with real-life systems.

In 1993, and in a more advanced and practically applicable way in 1997, the authors proposed the extension of fuzzy interpolation to a philosophically completely new area: the sub-models of the hierarchical rule base itself. The interpolation of sub-models in different sub-spaces raised new mathematical problems. The solution was given by a projection based approach with local calculations and subsequent substitution into the meta-level rule base, replacing sub-model symbols by actual fuzzy sub-conclusions.


After an overview of these types of fuzzy models we will deal with the problem of model identification. Often human experts provide the approximate rules, as they did in the hierarchical fuzzy helicopter control application by Sugeno referred to above, but more often there are only input-output data observed on a black box system as the starting point from which to construct the fuzzy rule structure and the rules themselves.

Sugeno and Yasukawa (1993) proposed the use of fuzzy c-means (FCM) clustering, originally introduced by Bezdek (1981), for identifying output clusters in the data base, from which rules could be constructed by best-fitting trapezoidal membership functions applied on the input projections of the output clusters. This method has limited applicability, partly because of the conditions of FCM clustering, and partly because of the need for well structured behavior from the black box under observation.

In the last few years our group has introduced a variety of identification methods for various types of fuzzy models. The FCM based approach could be essentially extended to hierarchical models as well, and was applied successfully to a real-life problem, petroleum reservoir characterization (Gedeon et al., 1997; Gedeon et al., 2003). Another was the bacterial evolutionary algorithm. The use of Levenberg-Marquardt (LM) optimization for finding the rule parameters was borrowed from the field of neural networks. These latter approaches all had their advantages and disadvantages, partly concerning convergence speed and accuracy, partly the locality and globality of the optimum. The most recent results (Botzheim et al., 2005) showed that the combination of the bacterial evolutionary algorithm and LM, called the Bacterial Memetic Algorithm, delivered very good results compared to the previous ones, with surprisingly good approximations even when using very simple fuzzy models. Research on extending this new method to more complicated fuzzy models is currently going on.

3 Fuzzy Rule Base Reduction

The classical approaches of fuzzy control deal with dense rule bases where the universe of discourse is fully covered by the antecedent fuzzy sets of the rule base in each dimension, so that for every input there is at least one activated rule. The main problem is the high computational complexity of these traditional approaches.

If a fuzzy model contains k variables and a maximum of T linguistic terms in each dimension, the order of the number of necessary rules is O(T^k). This expression can be decreased either by decreasing T, or k, or both. The first method leads to sparse rule bases and rule interpolation, and was first introduced by Kóczy and Hirota (see e.g. Kóczy and Hirota, 1993a; Kóczy and Hirota, 1993b). The second, more effectively, aims to reduce the dimension of the sub-rule bases by using meta-levels or hierarchical fuzzy rule bases. The combination of these two methods decreases both T and k, and was introduced in (Kóczy and Hirota, 1993c).


3.1 Fuzzy Rule Interpolation

The rule bases containing gaps require completely different techniques of reasoning from the traditional ones. The idea of the first interpolation technique, the KH method, was proposed in (Kóczy and Hirota, 1993a; Kóczy and Hirota, 1993b), and considers the representation of a fuzzy set as the union of its α-cuts:

A = ⋃_{α∈[0,1]} α·A_α     (1)

This method calculates the conclusion by its α-cuts. Theoretically all α-cuts should be considered, but for practical reasons only a finite set is taken into consideration during the computation. The KH rule interpolation algorithm requires the following conditions to be fulfilled: the fuzzy sets in both the premises and the consequences have to be convex and normal (CNF) sets with bounded support, having continuous membership functions. Further, there should exist a partial ordering among the CNF sets of each variable. We shall use the vector representation of fuzzy sets, which assigns a vector of its characteristic points to every fuzzy set. The representation of the piecewise linear fuzzy set A will be denoted by the vector a = [a_{−m}, . . . , a_0, . . . , a_n], where the a_k (k ∈ [−m, n]) are the characteristic points of A and a_0 is the reference point of A having membership degree one. A partial ordering among CNF (convex and normal) fuzzy sets is defined as: A ≺ B if a_k ≤ b_k for all k ∈ [−m, n].

The basic idea of fuzzy rule interpolation (KH-interpolation) is formulated in the Fundamental Equation of Rule Interpolation (FERI):

D(A∗, A1) : D(A∗, A2) = D(B∗, B1) : D(B∗, B2) (2)

In this equation A∗ and B∗ denote the observation and the corresponding conclusion, while R1 = A1 → B1 and R2 = A2 → B2 are the rules to be interpolated, such that A1 ≺ A∗ ≺ A2 and B1 ≺ B2. If in some sense D denotes the Euclidean distance between two symbols, the solution for B∗ results in simple linear interpolation. If D = d (the fuzzy distance family), linear interpolation between corresponding α-cuts is performed and the generated conclusion can be computed as below (as first described in (Kóczy and Hirota, 1993c)):

b∗_k = [ b1_k / d(a1_k, a∗_k) + b2_k / d(a2_k, a∗_k) ] / [ 1 / d(a1_k, a∗_k) + 1 / d(a2_k, a∗_k) ]     (3)

where the first index (1 or 2) represents the number of the rule, while the second (k) refers to the corresponding α-cut. From now on d(x, y) = |x − y|, so that (3) becomes KH b∗_k = (1 − λ_k) b1_k + λ_k b2_k, where λ_k = (a∗_k − a1_k)/(a2_k − a1_k) (for the left and right sides, respectively).
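As a rough illustration, this simplified piecewise-linear form can be sketched in a few lines of Python (the function name and the triangular example sets are ours, not from the paper):

```python
def kh_interpolate(a1, a2, a_obs, b1, b2):
    """Piecewise-linear KH interpolation.

    Each argument is a vector of characteristic points
    [x_-m, ..., x_0, ..., x_n]; entries with the same index belong to the
    same alpha-cut. Computes, per characteristic point,
        b*_k = (1 - lam_k) * b1_k + lam_k * b2_k,
        lam_k = (a*_k - a1_k) / (a2_k - a1_k),
    assuming A1 < A* < A2 so the denominator never vanishes.
    """
    b_star = []
    for a1k, a2k, ak, b1k, b2k in zip(a1, a2, a_obs, b1, b2):
        lam = (ak - a1k) / (a2k - a1k)
        b_star.append((1 - lam) * b1k + lam * b2k)
    return b_star

# Triangular sets written as [left foot, peak, right foot]:
# rules A1 -> B1, A2 -> B2 and an observation A* halfway between A1 and A2.
print(kh_interpolate([0, 1, 2], [4, 5, 6], [2, 3, 4],
                     [0, 1, 2], [8, 9, 10]))   # [4.0, 5.0, 6.0]
```

Since the observation lies exactly halfway between the two antecedents, the conclusion lands halfway between the two consequents, as expected.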


This family of interpolation techniques has various advantageous properties, mainly simplicity, convenient practical applicability, and stability. However, the direct applicability of this result is limited, as the points defined by these equations often describe an abnormal membership function for B∗ that needs further transformation to obtain a regular fuzzy set. Even then, the conclusion may need further transformation, since it may not be normal. Furthermore, the method does not conserve the piecewise linearity of the rules and the observation. In order to avoid this difficulty, some alternative or modified interpolation algorithms were proposed, which solve the problem of abnormal solutions.

Some of these algorithms rely on the same idea of applying the Fundamental Equation of rule interpolation for some kind of metric, or in some cases a non-metric dissimilarity degree. However, the way of measuring this distance or dissimilarity varies. In (Gedeon and Kóczy, 1996), the distance is measured only for α = 1, and so the position and length of the core in each of the k input dimensions determine the position and the core of the conclusion in the output space Y. The remaining information is contained in the shape of the flanks of the fuzzy sets describing the rules (the parts where 0 < μ < 1). Depending on the type of the membership functions (trapezoidal or more general), a single value of fuzziness, F, is the basis for the calculation of the points of the flanks of B∗, where F = |S_0 − S_1|, S being an arbitrary fuzzy set, with the subscripts referring to the corresponding minimal or maximal points of the support and the core, respectively. Alternatively, a function of α-dependent fuzziness values based on either the core or the support points is the basis for this calculation; here, however, a logarithmic type of dissimilarity is applied. These latter methods guarantee that the resulting conclusion is always "normal" (in the sense of not being abnormal).

In (Baranyi et al., 1999), a coordinate transformation of the data is used before applying the KH interpolation, which guarantees the normality of the conclusion obtained after interpolation.

3.2 Hierarchical Fuzzy Rule Bases

The basic idea of using hierarchical fuzzy rule bases is the following. Often the multi-dimensional input space X = X_1 × X_2 × . . . × X_m can be decomposed, so that some of its components, e.g. Z_0 = X_1 × X_2 × . . . × X_p, determine a subspace of X (p < m), such that in Z_0 a partition Π = {D_1, D_2, . . . , D_n} can be determined:

⋃_{i=1}^{n} D_i = Z_0.

In each element of Π, i.e. D_i, a sub-rule base R_i can be constructed with local validity. In the worst case, each sub-rule base refers to exactly X/Z_0 = X_{p+1} × . . . × X_m. The complexity of the whole rule base, O(T^m), is not decreased, as the size of R_0 is O(T^p) and each R_i, i > 0, is of order O(T^{m−p}), with O(T^p) × O(T^{m−p}) = O(T^m).

A way to decrease the complexity would be to find in each D_i a proper subset of {X_{p+1}, . . . , X_m}, so that each R_i contains fewer than m − p input variables.


In some concrete applications, in each D_i a proper subset of {X_{p+1}, . . . , X_m} can be found so that each R_i contains fewer than m − p input variables, and the rule base then has the following structure:

R_0: If z_0 is D_1 then use R_1
     If z_0 is D_2 then use R_2
     . . .
     If z_0 is D_n then use R_n

R_1: If z_1 is A_11 then y is B_11
     If z_1 is A_12 then y is B_12
     . . .
     If z_1 is A_1r1 then y is B_1r1

R_2: If z_2 is A_21 then y is B_21
     If z_2 is A_22 then y is B_22
     . . .
     If z_2 is A_2r2 then y is B_2r2

. . .

R_n: If z_n is A_n1 then y is B_n1
     If z_n is A_n2 then y is B_n2
     . . .
     If z_n is A_nrn then y is B_nrn

where z_i ∈ Z_i, Z_0 × Z_i being a proper subspace of X for i = 1, . . . , n.

If the number of variables in each Z_i is k_i < m − p and max_{i=1,...,n} k_i = K < m − p, then the resulting complexity will be O(T^{p+K}) < O(T^m), so the structured rule base leads to a reduction of the complexity.
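The size of this reduction is easy to check numerically. For instance, with T = 5 terms, m = 6 variables, p = 2 meta-level variables and K = 2 (values chosen here purely for illustration, not taken from the paper):

```python
T, m, p, K = 5, 6, 2, 2          # illustrative values

flat_rules = T ** m              # dense flat rule base: O(T^m)
structured_rules = T ** (p + K)  # hierarchical rule base: O(T^(p+K))

print(flat_rules, structured_rules)   # 15625 625
```

The structured rule base needs 625 rules instead of 15625, a 25-fold reduction in this toy setting.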

The task of finding such a partition is often difficult, if not impossible (sometimes such a partition does not even exist). There are cases when, locally, some variables unambiguously dominate the behavior of the system, and consequently the omission of the other variables allows an acceptably accurate approximation. The bordering regions of the local domains might not be crisp or, even worse, these domains may overlap. For example, there can be a region D_1 where the proper subspace Z_1 dominates, and another region D_2 where another proper subspace Z_2 is sufficient for the description of the system; however, in the region between D_1 and D_2 all variables in [Z_1 × Z_2] play a significant role ([. × .] denoting the space that contains all variables that occur in either argument within the brackets). In this case, sparse fuzzy partitions can be used, so that in each element of the partition a proper subset of the remaining input state variables is identified as exclusively dominant. Such a sparse fuzzy partition can be described as follows: Π = {D_1, D_2, . . . , D_n} and

⋃_{i=1}^{n} Core(D_i) ⊂ Z_0

in the proper sense (fuzzy partition). Even

⋃_{i=1}^{n} Supp(D_i) ⊂ Z_0

is possible (sparse partition). If the fuzzy partition chosen is informative enough concerning the behavior of the system, it is possible to interpolate its model among the elements of Π.

Each element D_i will determine a sub-rule base R_i referring to another subset of variables. The technical difficulty is how to combine the "sub-conclusions" B∗_i, with the help of R_0, into the final conclusion.


3.3 Fuzzy Rule Interpolation in Hierarchical Rule Bases

This section deals with the reduction of both the maximum number T of linguistic terms and the number k of variables.

Let us assume that the observation on X is A∗ and its projections are:

A∗0 = A∗/Z0, A∗1 = A∗/Z1, A∗2 = A∗/Z2.

Using the Fundamental Equation, the two sub-conclusions obtained from the two sub-rule bases R_1 and R_2 are:

KH b^{1∗}_k = (1 − λ^1_k) b^1_{1k} + λ^1_k b^1_{2k}, and

KH b^{2∗}_k = (1 − λ^2_k) b^2_{1k} + λ^2_k b^2_{2k}, respectively.

(The superscript shows the reference to the rule bases R_1 and R_2.) Finally, by substituting the sub-conclusions into the meta-rule base, we get:

KH b∗_k = (1 − λ^0_k) b^{1∗}_k + λ^0_k b^{2∗}_k     (4)

The steps of the algorithm are as follows:

1. Determine the projection A∗_0 of the observation A∗ to the subspace of the fuzzy partition Π.
2. Find the interpolating rules.
3. Determine λ^0_k.
4. For each R_i determine A∗_i, the projection of A∗ to Z_i. Find the interpolating rules in each R_i.
5. Determine the sub-conclusions for each sub-rule base R_i.
6. Using the sub-conclusions from step 5, compute the final conclusion according to (4).

Figure 1 shows an example of the algorithm. In step 4, a different inference engine can be applied in each sub-rule base: e.g., if the sub-rule base itself is dense, the Mamdani algorithm (or one of its variations), or in any case one of the interpolation algorithms can be used. It might be reasonable to apply the KH interpolation or one of the methods summarized above in step 4, while in step 5 usually the best recommendation is the method in (Gedeon and Kóczy, 1996), as it is directly applicable also for flanking domains D_i with wide areas and projected observations A∗_0 with narrow supports, e.g. crisp singletons.
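For two meta-level regions and two single-pair sub-rule bases, the six steps above can be sketched as follows (a minimal illustration; all names are ours, every fuzzy set is a vector of characteristic points as in Sect. 3.1, and the observation is assumed to fall strictly between the flanking sets):

```python
def kh_pair(a1, a2, a_obs, b1, b2):
    # one KH interpolation between two flanking rules, per characteristic point
    lam = [(o - x1) / (x2 - x1) for x1, x2, o in zip(a1, a2, a_obs)]
    return [(1 - l) * y1 + l * y2 for l, y1, y2 in zip(lam, b1, b2)]

def hierarchical_infer(obs, d1, d2, r1, r2):
    """obs = {"z0": ..., "z1": ..., "z2": ...} holds the projections of A*.
    d1, d2: characteristic points of D1, D2 on Z0;
    r1 = (A11, A12, B11, B12), r2 = (A21, A22, B21, B22)."""
    # steps 1-3: project A* to Z0 and determine lambda^0 per point
    lam0 = [(o - x1) / (x2 - x1) for x1, x2, o in zip(d1, d2, obs["z0"])]
    # steps 4-5: sub-conclusions B1*, B2* from the two sub-rule bases
    b1_star = kh_pair(r1[0], r1[1], obs["z1"], r1[2], r1[3])
    b2_star = kh_pair(r2[0], r2[1], obs["z2"], r2[2], r2[3])
    # step 6: substitute into the meta-rule base, Eq. (4)
    return [(1 - l) * b1 + l * b2 for l, b1, b2 in zip(lam0, b1_star, b2_star)]

obs = {"z0": [2, 3, 4], "z1": [2, 3, 4], "z2": [2, 3, 4]}
r1 = ([0, 1, 2], [4, 5, 6], [0, 1, 2], [4, 5, 6])
r2 = ([0, 1, 2], [4, 5, 6], [10, 11, 12], [14, 15, 16])
print(hierarchical_infer(obs, [0, 1, 2], [4, 5, 6], r1, r2))  # [7.0, 8.0, 9.0]
```

Here the observation sits halfway inside each partition, so the final conclusion is the midpoint of the two sub-conclusions [2, 3, 4] and [12, 13, 14].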

4 Fuzzy Model Identification

One of the crucial problems of fuzzy rule based modeling is how to find an optimal, or at least a quasi-optimal, rule base for a certain system. In most applications there is


[Figure] Fig. 1 Interpolation in a hierarchical model. The meta-level rule base R_0 (If z_0 is D_1 then use R_1; If z_0 is D_2 then use R_2) dispatches to the sub-rule bases R_1 (rules on z_1 with consequents B_11, B_12) and R_2 (rules on z_2 with consequents B_21, B_22); the projections A∗/Z_1 and A∗/Z_2 of the observation A∗ yield the sub-conclusions B^{1∗} and B^{2∗}, which are combined into the final conclusion B∗.

no human expert available, thus some automatic method to determine the fuzzy rule base must be employed. In this section and in the next one some of these methods are introduced.

4.1 Clustering-based Rule Extraction Technique

Recently, clustering-based approaches have been proposed for rule extraction (Wong et al., 1997; Wang and Mendel, 1992). Most of the techniques use the idea of partitioning the input space into fixed regions to form the antecedents of the fuzzy rules. Although these techniques have the advantage of efficiency, they may lead to the creation of a dense rule base that suffers from rule explosion. In general, the number of rules generated is T^k, where k is the number of input dimensions and T is the number of terms per input, so the number of rules grows exponentially with the number of input dimensions. For this reason, the techniques are not suited for generating fuzzy rule bases that have a large number of input dimensions.

Among the rule extraction techniques proposed in the literature, Sugeno and Yasukawa's (Sugeno and Yasukawa, 1993) technique (referred to as the "SY method" hereafter) is one of the earliest works that emphasize the generation of a sparse rule base. The SY approach clusters only the output data and induces the rules by computing the projections to the input domains of the cylindrical extensions of the fuzzy clusters. This way, the method produces only the necessary number of rules for the input-output sample data (more details later). The paper (Sugeno and Yasukawa, 1993) discusses the proposed technique at the methodological level, leaving out some implementation details. The SY technique was further examined in (Tikk et al., 2002), where additional readily implementable techniques are proposed to complete the modeling methodology.


In the first step of SY modeling, the Regularity criterion (Ihara, 1980) is used to assist in the identification of "true" input variables that have a significant influence on the output. The input variables that have little or no influence on the output are ignored for the rest of the process. The true input variables are then used in the actual rule extraction process. The rule extraction process starts with the determination of the partition of the output space. This is done by using fuzzy c-means clustering (Bezdek, 1981) (see Sect. 4.2).

For each output fuzzy cluster B_i resulting from the fuzzy c-means clustering, a cluster A_i in the input space can be induced. The input cluster can be projected onto the various input dimensions to produce rules of the form:

If x_1 is A_i1 and x_2 is A_i2 and . . . and x_n is A_in then y is B_i
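A toy sketch of the projection-and-fitting step (our own crude heuristic, not the exact SY fitting procedure): given the membership values of one induced input cluster over sampled points of a single input dimension, a trapezoid can be read off from the support extrema and an α-cut of the core.

```python
def fit_trapezoid(xs, mu, alpha=0.5):
    """Fit a trapezoid [a, b, c, d] to projected memberships mu over points xs:
    the feet (a, d) span the support, the shoulders (b, c) span the alpha-cut.
    A crude stand-in for the best-fitting-trapezoid step of the SY method."""
    support = [x for x, u in zip(xs, mu) if u > 0]
    core = [x for x, u in zip(xs, mu) if u >= alpha]
    return [min(support), min(core), max(core), max(support)]

# Projected memberships of one induced input cluster over five sample points:
print(fit_trapezoid([0, 1, 2, 3, 4], [0.1, 0.6, 1.0, 0.6, 0.1]))  # [0, 1, 3, 4]
```

One trapezoid per input dimension then yields the antecedents A_i1, . . . , A_in of one rule.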

4.2 Fuzzy c-Means Clustering

Given a set of data, fuzzy c-means clustering (FCMC) performs clustering by iteratively searching for a set of fuzzy partitions and the associated cluster centers that represent the structure of the data as well as possible. The FCMC algorithm relies on the user to specify the number of clusters present in the set of data to be clustered. Given the number of clusters c, FCMC partitions the data X = {x_1, x_2, . . . , x_n} into c fuzzy partitions by minimizing the within-group sum of squared error objective function (5).

J_m(U, V) = Σ_{k=1}^{n} Σ_{i=1}^{c} (U_ik)^m ‖x_k − v_i‖²,   1 ≤ m ≤ ∞     (5)

where J_m(U, V) is the sum of squared error for the set of fuzzy clusters represented by the membership matrix U and the associated set of cluster centers V, and ‖.‖ is some inner product induced norm. In the formula, ‖x_k − v_i‖² represents the squared distance between the data point x_k and the cluster center v_i. The squared error is used as a performance index that measures the weighted sum of distances between cluster centers and elements in the corresponding fuzzy clusters. The number m governs the influence of the membership grades in the performance index. The partition becomes fuzzier with increasing m, and it has been shown that the FCMC algorithm converges for any m ∈ (1, ∞). The necessary conditions for (5) to reach its minimum are

U_ik = ( Σ_{j=1}^{c} ( ‖x_k − v_i‖ / ‖x_k − v_j‖ )^{2/(m−1)} )^{−1}   ∀i, ∀k     (6)

and


v_i = Σ_{k=1}^{n} (U_ik)^m x_k / Σ_{k=1}^{n} (U_ik)^m     (7)

In each iteration of the FCMC algorithm, the matrix U is computed using (6) and the associated cluster centers are computed as (7). This is followed by computing the squared error in (5). The algorithm stops when either the error is below a certain tolerance value or its improvement over the previous iteration is below a certain threshold.
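The alternating updates (6)-(7) can be sketched directly in NumPy (a minimal illustration only; the random initialisation, naming and stopping details are our own choices, not the paper's):

```python
import numpy as np

def fcm(X, c, m=2.0, tol=1e-6, max_iter=200, seed=0):
    """Fuzzy c-means: alternate the membership update (6) and the centre
    update (7) until the objective J_m of (5) stops improving."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                                  # random fuzzy partition
    prev_J = np.inf
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)    # centres, Eq. (7)
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        d2 = np.fmax(d2, 1e-12)                         # guard against x_k == v_i
        inv = d2 ** (-1.0 / (m - 1))
        U = inv / inv.sum(axis=0)                       # memberships, Eq. (6)
        J = ((U ** m) * d2).sum()                       # objective, Eq. (5)
        if prev_J - J < tol:
            break
        prev_J = J
    return U, V, J

X = np.array([[0., 0.], [0., 1.], [1., 0.], [10., 10.], [10., 11.], [11., 10.]])
U, V, J = fcm(X, c=2)
print(np.round(np.sort(V[:, 0]), 2))   # centres close to x = 0.33 and x = 10.33
```

On this toy data set the two centres converge, as expected, very close to the crisp means of the two point groups.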

The FCMC algorithm cannot be used in situations where the number of clusters in a set of data is not known in advance. Since the introduction of FCMC, a reasonable amount of work has been done on finding the optimal number of clusters in a set of data. This is referred to as the cluster validity problem. Among the numerous alternative cluster validity indices proposed in the literature, the most suitable proved to be the one proposed by Fukuyama and Sugeno (1989).

S(c) = Σ_{k=1}^{n} Σ_{i=1}^{c} (U_ik)^m ( ‖x_k − v_i‖² − ‖v_i − x̄‖² ),   2 < c < n     (8)

where n is the number of data points to be clustered; c is the number of clusters; x_k is the kth data point; x̄ is the average of the data; v_i is the ith cluster center; U_ik is the membership degree of the kth data point with respect to the ith cluster; and m is the fuzzy exponent. The number of clusters, c, is determined so that S(c) reaches a local minimum as c increases. The terms ‖x_k − v_i‖ and ‖v_i − x̄‖ represent the variance within each cluster and the variance between clusters, respectively. Therefore, the optimal number of clusters is found by minimizing the distance between the data and the corresponding cluster centers while maximizing the distance between the clusters. Other cluster validity indices can be found in (Yang and Wu, 2001).
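Given the partition matrix U and centres V returned by any FCM run, (8) is essentially a one-liner; here is a sketch in NumPy (the example U and V below are hand-made stand-ins, not FCM output):

```python
import numpy as np

def fs_index(X, U, V, m=2.0):
    """Fukuyama-Sugeno index S(c), Eq. (8): within-cluster compactness minus
    centre-to-grand-mean separation; smaller (more negative) is better."""
    x_bar = X.mean(axis=0)
    d_in = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)   # ||x_k - v_i||^2
    d_out = ((V - x_bar) ** 2).sum(axis=1, keepdims=True)       # ||v_i - x_bar||^2
    return float(((U ** m) * (d_in - d_out)).sum())

# Two tight, well-separated clusters: S(c) comes out clearly negative.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [10., 10.], [10., 11.], [11., 10.]])
V = np.array([[1 / 3, 1 / 3], [31 / 3, 31 / 3]])
U = np.array([[0.99, 0.99, 0.99, 0.01, 0.01, 0.01],
              [0.01, 0.01, 0.01, 0.99, 0.99, 0.99]])
print(fs_index(X, U, V) < 0)   # True
```

In a c-selection loop one would evaluate fs_index for each candidate c and pick the first local minimum.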

4.3 Hierarchical Fuzzy Modeling

Fuzzy clustering plays an important role in feature selection. The idea is that, given a set of clusters, the most important subset of features (true inputs) can be selected by considering its capability to separate the clusters (Tikk and Gedeon, 2000; Pal, 1992). We use the interclass separability criterion for this purpose. Consider a set of N input-output pairs F = {X; y}, X = {x_i | i ∈ I}, where I is the index set and x_i and y are column vectors. By deleting some features (input variables), we obtain a subspace X′ = {x_i | i ∈ I′}, I′ ⊂ I. Suppose that the input X is clustered into clusters C_i (i = 1, . . . , N_c); then the criterion function for feature ranking based on interclass separability is formulated by means of the following fuzzy between-class (9) and within-class (11) scatter (covariance) matrices.


Q_b = Σ_{i=1}^{N_c} Σ_{j=1}^{N} μ_ij^m (v_i − v)(v_i − v)^T     (9)

Q_i = ( 1 / Σ_{j=1}^{N} μ_ij^m ) Σ_{j=1}^{N} μ_ij^m (x_j − v_i)(x_j − v_i)^T     (10)

Q_w = Σ_{i=1}^{N_c} Q_i     (11)

where

v = (1/N_c) Σ_{i=1}^{N_c} v_i     (12)

Here, v_i is given by (7). The criterion is a trade-off between Q_b (9) and Q_w (11), often expressed as:

J(X') = \frac{\mathrm{tr}(Q_b)}{\mathrm{tr}(Q_w)}    (13)

where ‘tr’ denotes the trace of a matrix. In Tikk and Gedeon (2000), the set of classes C is determined by clustering the output space using a fuzzy clustering algorithm such as fuzzy c-means (Bezdek, 1981). The elements of the resulting partition matrix U = {μ_ik | i = 1 . . . N_c, k = 1 . . . N} are then used as the weights in (9)–(12). Each feature can be ranked using the sequential backwards algorithm. First, different subsets of the data are obtained by temporarily deleting each feature in turn. The feature whose removal results in the largest value of the criterion (13) is then deleted permanently. This process is repeated until all features are deleted, and the order of the deleted variables gives their rank of importance.
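A minimal sketch of (9)–(13), assuming NumPy arrays for the data, the cluster centers and the partition matrix (the shapes and names are our own convention):

```python
import numpy as np

def interclass_separability(X, V, U, m=2.0):
    """J(X') = tr(Qb)/tr(Qw) from Eqs. (9)-(13).

    X: (N, d) data restricted to the candidate feature subset,
    V: (Nc, d) cluster centers, U: (Nc, N) fuzzy memberships.
    Larger J means the candidate features separate the clusters better.
    """
    Um = U ** m
    v_bar = V.mean(axis=0)                                    # Eq. (12)
    diff_b = V - v_bar                                        # (Nc, d)
    # Eq. (9): sum_i (sum_j mu_ij^m) (v_i - v_bar)(v_i - v_bar)^T
    Qb = (Um.sum(axis=1)[:, None, None]
          * np.einsum('id,ie->ide', diff_b, diff_b)).sum(axis=0)
    Qw = np.zeros((X.shape[1], X.shape[1]))
    for i in range(V.shape[0]):                               # Eqs. (10)-(11)
        diff_w = X - V[i]
        Qi = np.einsum('n,nd,ne->de', Um[i], diff_w, diff_w) / Um[i].sum()
        Qw += Qi
    return float(np.trace(Qb) / np.trace(Qw))
```

A feature subset whose clusters are tight (small tr(Qw)) and far apart (large tr(Qb)) scores high, matching the ranking rule described above.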

From here, we obtain a set of features ordered (ascending) by their importance. The set of true inputs can then be determined by selecting the n most important features that minimize:

\sum_i \left\{ y_i - \mathrm{defuzz}\left( \sum_{c} \mu_{ci} \left( \sum_{j=1}^{c} \left( \frac{\|x_k - v_i\|}{\|x_k - v_j\|} \right)^{2/(m-1)} \right) \right) \right\}^2    (14)

where v_i is given by (7). Here, defuzz(·) denotes a defuzzification method, such as the center of area (COA), that is used in the fuzzy inference.


122 L.T. Kóczy et al.

4.3.1 Finding Π and Z0

The main requirement of a reasonable Π is that each of its elements D_i can be modelled by a rule base with local validity. In this case, it is reasonable to expect D_i to contain homogeneous data. The problem of finding Π can thus be reduced to finding homogeneous structures within the data. This can be achieved by clustering algorithms.

The subspace Z0 is used by meta-rules to select the most appropriate sub-rule base to infer the output for a given observation (i.e. system input). In general, the more separable the elements of Π are, the easier the sub-rule base selection becomes. Therefore, the proper subspace can be determined by means of some criterion that ranks the importance of different subspaces (combinations of variables) based on their capability in separating the components D_i.

Unfortunately, the problems of finding Π and Z0 are not independent of each other. Consider the following helicopter control fuzzy rules extracted from the hierarchical fuzzy system in (Sugeno, 1991):

If distance (from obstacle) is small then hover

Hover:
if (helicopter) body rolls right then move lateral stick leftward
if (helicopter) body pitches forward then move longitudinal stick backward

On the one hand, we need to rely on the feature selection technique to find a “good” subspace (Z0) for the clustering algorithm to be effective. On the other hand, most of the feature selection criteria assume the existence of the clusters beforehand.

One way to tackle this problem is to adopt a projection-based approach. The data are first projected onto each individual dimension and fuzzy clustering is performed on each dimension separately. The number of clusters C for clustering can be chosen arbitrarily large. With the set of one-dimensional fuzzy clusters obtained, the separability (importance) of each input dimension can be determined separately by computing (13). With this ranking, we can then select the n most important features to form the n-dimensional subspace Z0.

From here, a hierarchical fuzzy system can be obtained using the following steps:

1. Perform fuzzy c-means clustering on the data along the subspace Z0. The optimal number of clusters, C, within the set of data is determined by means of the FS index (8).

2. The previous step results in a fuzzy partition Π = {D_1, . . . , D_C}. For each component in the partition, a meta-rule is formed as: If Z0 is D_i then use R_i

3. From the fuzzy partition Π, a crisp partition of the data points is constructed, i.e. for each fuzzy cluster D_i, the corresponding crisp cluster of points is determined as P_i = {p | μ_i(p) > μ_j(p) ∀ j ≠ i}.


4. Construct the sub-rule bases. For each crisp partition P_i, apply a feature extraction algorithm to eliminate unimportant features. The remaining features (known as true inputs) are then used by a fuzzy rule extraction algorithm to create the fuzzy rule base R_i. Here, we suggest the use of the projection-based fuzzy modeling approach (Chong et al., 2002). We remark, however, that the hierarchical fuzzy modeling scheme does not place any restriction on the technique used. If more hierarchical levels are desired, repeat steps 1–4 with the data points in P_i using another subspace.
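The hardening in step 3 (crisp clusters from the fuzzy partition) can be sketched as follows; the function and variable names are our own:

```python
def crisp_partition(membership):
    """Assign each point p to the cluster P_i whose membership
    mu_i(p) exceeds every other mu_j(p), j != i.

    membership: one row per point, each row holding the degrees
    mu_1(p) .. mu_C(p) of that point in the C fuzzy clusters.
    Returns a list of C index lists, one per crisp cluster.
    """
    C = len(membership[0])
    P = [[] for _ in range(C)]
    for p, mus in enumerate(membership):
        winner = max(range(C), key=lambda i: mus[i])
        P[winner].append(p)
    return P
```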

4.3.2 Modeling

Now the proposed fuzzy modeling technique is described. The algorithm is as follows:

1. Rank the importance of the input variables using fuzzy clustering and the interclass separability criterion, and let F be the set of features ordered (ascending) by their importance.

2. For i = 1 . . . |F|:
a. Construct a hierarchical fuzzy system using the subspace Z0 = X_1 × . . . × X_i.
b. Compute ε_i, the re-substitution error of the fuzzy system constructed during the ith iteration.
c. If i > 1 and ε_i > ε_{i−1}, stop.
3. Perform parameter tuning. The completed hierarchical fuzzy rule base then goes through a parameter identification process where the parameters of the membership functions used in the fuzzy rules are adjusted on a trial-and-error basis to improve the overall performance. The method proposed in (Sugeno and Yasukawa, 1993) adjusts each of the trapezoidal fuzzy set parameters in both directions and chooses the one that gives the best system performance.
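The subspace-growing loop of steps 2a–2c can be sketched with hypothetical callables standing in for the heavy parts (system construction and error evaluation); everything here is illustrative, not the authors' implementation:

```python
def select_true_inputs(features, build_system, resub_error):
    """Grow the subspace Z0 one ranked feature at a time and stop
    as soon as the re-substitution error starts to increase.

    `build_system(subspace)` and `resub_error(system)` are assumed
    callables standing in for the hierarchical modeling steps.
    Returns the chosen subspace and its error.
    """
    best, best_err = None, None
    for i in range(1, len(features) + 1):
        subspace = features[:i]
        err = resub_error(build_system(subspace))
        if best_err is not None and err > best_err:
            break                      # error started to grow: stop
        best, best_err = subspace, err
    return best, best_err
```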

5 Bacterial Memetic Algorithm

There are various successful evolutionary optimization algorithms known from the literature. The advantage of these algorithms is their ability to solve and quasi-optimize problems of non-linear, high-dimensional, multi-modal, and discontinuous character. The original genetic algorithm was based on the process of evolution of biological organisms. These processes can easily be applied to optimization problems where one individual corresponds to one solution of the problem. A more recent evolutionary technique is the bacterial evolutionary algorithm (Nawa and Furuhashi, 1999; Botzheim et al., 2002). It mimics microbial rather than eukaryotic evolution: bacteria share chunks of their genes rather than performing neat crossover of chromosomes. This mechanism is used in the bacterial mutation


Fig. 2 Encoding of the fuzzy rules. [Figure: the bacterium lists Rule 1, Rule 2, . . . , Rule R; for each rule, the four trapezoid breakpoints of every input and of the output are stored, e.g. a31 = 4.3, b31 = 5.1, c31 = 5.4, d31 = 6.3 for Input 1 of Rule 3, and a3 = 2.6, b3 = 3.8, c3 = 4.1, d3 = 4.4 for the Output.]

and in the gene transfer operations. For the bacterial algorithm, the first step is to determine how the problem can be encoded in a bacterium (chromosome). Our task is to find the optimal fuzzy rule base for a pattern set; thus, the parameters of the fuzzy rules must be encoded in the bacterium. The parameters of the rules are the breakpoints of the trapezoids, so a bacterium contains these breakpoints. As an example, the encoding of a fuzzy system with two inputs and one output is shown in Fig. 2.

Neural network training algorithms can also be used for fuzzy rule optimization (Botzheim et al., 2004). They find an accurate local optimum. By incorporating a neural network training algorithm into the bacterial approach, the advantages of both methods can be utilized in the optimization process. The hybridization of these two methods leads to a new kind of memetic algorithm, because the bacterial technique is used instead of the classical genetic algorithm, with the Levenberg-Marquardt algorithm as the local searcher. This method is the Bacterial Memetic Algorithm (BMA) (Botzheim et al., 2005). The flowchart of the algorithm can be seen in Fig. 3. The difference between the BEA and the BMA is that the latter contains a local search step, the Levenberg-Marquardt procedure.
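As an illustration of this encoding (a hypothetical layout, not the authors' implementation), a bacterium can be held as nested lists of trapezoid breakpoints:

```python
import random

def random_trapezoid(lo=0.0, hi=10.0):
    """Four ordered breakpoints a <= b <= c <= d of one membership
    function, drawn uniformly from [lo, hi] (an assumed range)."""
    return sorted(random.uniform(lo, hi) for _ in range(4))

def random_bacterium(n_rules, n_inputs):
    """One bacterium: for each rule, one trapezoid per input plus one
    for the output, i.e. (n_inputs + 1) * 4 parameters per rule, as
    in the encoding of Fig. 2."""
    return [[random_trapezoid() for _ in range(n_inputs + 1)]
            for _ in range(n_rules)]
```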

5.1 Initial Population Creation

First, the initial (random) bacterial population is created. The population consists of N_ind bacteria, so all membership functions in these bacteria must be randomly initialized. The initial number of rules in a single bacterium is a constant, N_rule. Thus, N_ind (k + 1) N_rule membership functions are created, where k is the number of input variables in the given problem, and each membership function has four parameters.

5.2 Bacterial Mutation

Bacterial mutation is applied to each bacterium one by one (Nawa and Furuhashi, 1999). First, N_clones copies (clones) of the rule base are generated. Then a certain part of the bacterium (e.g. a rule) is randomly selected and the parameters


Fig. 3 Flowchart of the Bacterial Memetic Algorithm

of this selected part are randomly changed in each clone (mutation). Next, all the clones and the original bacterium are evaluated by an error criterion. The best individual transfers the mutated part to the other individuals. This cycle is repeated for the remaining parts until all parts of the bacterium have been mutated and tested. At the end, the best rule base is kept and the remaining N_clones are discarded.
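The cycle above can be sketched as follows; the list-of-rules representation and the re-randomizing mutation are our own simplifying assumptions:

```python
import copy
import random

def bacterial_mutation(bacterium, evaluate, n_clones=3):
    """Sketch of bacterial mutation: for each part (rule) make
    n_clones mutated copies, evaluate the clones and the original,
    and keep the best variant of that part; the remaining clones
    are discarded.  `evaluate` returns an error to minimize."""
    best = copy.deepcopy(bacterium)
    for part in range(len(best)):
        candidates = [best]                       # original included
        for _ in range(n_clones):
            clone = copy.deepcopy(best)
            # mutation: re-randomize the trapezoids of the selected rule
            clone[part] = [sorted(random.uniform(0.0, 10.0) for _ in range(4))
                           for _ in clone[part]]
            candidates.append(clone)
        best = min(candidates, key=evaluate)      # best individual wins
    return best
```

Because the unmutated bacterium is always among the candidates, the error never increases from one part to the next.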

5.3 Levenberg-Marquardt Method

After the bacterial mutation step, the Levenberg-Marquardt algorithm (Marquardt, 1963) is applied to each bacterium. In this step a minimization criterion related to the quality of the fit must also be employed. The training criterion employed here is the usual Sum of Squared Errors (SSE):

E = \| t - y \|_2^2 = \| e[k] \|_2^2    (15)

where t stands for the target vector, y for the output vector, e for the error vector, and ‖·‖₂ denotes the 2-norm. It is assumed that there are m patterns in the training set.


The most commonly used method to minimize (15) is the Error Back-Propagation (BP) algorithm, which is a steepest descent algorithm. The BP algorithm is a first-order method, as it only uses first-order derivatives. If no line search is used, it has no guarantee of convergence, and the convergence rate obtained is usually very slow. If a second-order method is to be employed, the best choice is the Levenberg-Marquardt (LM) algorithm (Marquardt, 1963), which explicitly exploits the underlying sum-of-squares structure of the optimization problem at hand.

Denoting by J the Jacobian matrix:

J[k] = \left[ \frac{\partial y(x^{(p)})[k]}{\partial \mathit{par}[k]} \right]    (16)

where the vector par contains all the membership function parameters (all breakpoints of the membership functions), and k is the iteration variable, the new parameter values can be obtained by the update rule of the LM method:

\mathit{par}[k + 1] = \mathit{par}[k] - \left[ J^T[k] \, J[k] + \alpha I \right]^{-1} J^T[k] \, e[k]    (17)

In (17), α is a regularization parameter which controls both the search direction and the magnitude of the update. The search direction varies between the Gauss-Newton direction and the steepest descent direction, according to the value of α. This depends on how well the actual criterion agrees with a quadratic function in a particular neighborhood. The good results obtained with the LM method (compared with other second-order methods such as the quasi-Newton and conjugate gradient methods) are due to this explicit exploitation of the underlying characteristics of the optimization problem (a sum of squared errors) by the training algorithm. The Jacobian matrix with respect to the parameters in the bacterium must be computed. This can be done on a pattern-by-pattern basis (Botzheim et al., 2004; Ruano et al., 2001).
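The update rule (17) is a few lines of linear algebra. The sketch below assumes the residual convention e = y − t (which matches the minus sign in (17)); the callable names are our own:

```python
import numpy as np

def lm_step(par, jacobian, error, alpha):
    """One Levenberg-Marquardt update, Eq. (17):
    par <- par - (J^T J + alpha*I)^(-1) J^T e.

    `jacobian(par)` returns J of shape (m patterns, n parameters),
    `error(par)` returns the residual e = y(par) - t of shape (m,).
    Large alpha pushes the step toward steepest descent,
    small alpha toward the Gauss-Newton direction.
    """
    J = jacobian(par)
    e = error(par)
    H = J.T @ J + alpha * np.eye(J.shape[1])
    return par - np.linalg.solve(H, J.T @ e)
```

For a model linear in its parameter, a single step with small α recovers the least-squares solution, since the criterion is then exactly quadratic.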

Jacobian computation

Because trapezoidal membership functions are used, each with four parameters, the membership degree (relative importance) of the jth fuzzy variable in the ith rule is:

\mu_{ij}(x_j) = \frac{x_j - a_{ij}}{b_{ij} - a_{ij}} N_{i,j,1}(x_j) + N_{i,j,2}(x_j) + \frac{d_{ij} - x_j}{d_{ij} - c_{ij}} N_{i,j,3}(x_j)    (18)

where a_{ij} ≤ b_{ij} ≤ c_{ij} ≤ d_{ij} must hold and:


N_{i,j,1}(x_j) = \begin{cases} 1, & \text{if } x_j \in [a_{ij}, b_{ij}] \\ 0, & \text{if } x_j \notin [a_{ij}, b_{ij}] \end{cases}    (19)

N_{i,j,2}(x_j) = \begin{cases} 1, & \text{if } x_j \in (b_{ij}, c_{ij}) \\ 0, & \text{if } x_j \notin (b_{ij}, c_{ij}) \end{cases}

N_{i,j,3}(x_j) = \begin{cases} 1, & \text{if } x_j \in [c_{ij}, d_{ij}] \\ 0, & \text{if } x_j \notin [c_{ij}, d_{ij}] \end{cases}
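Equations (18)–(19) amount to the usual piecewise-linear trapezoid; a direct sketch (the function name is ours, and strict inequalities a < b, c < d are assumed where a division occurs):

```python
def trapezoid_membership(x, a, b, c, d):
    """Membership degree of Eq. (18) with the indicator terms of (19):
    rises linearly on [a, b], is 1 on (b, c), falls linearly on [c, d],
    and is 0 elsewhere.  Assumes a <= b <= c <= d."""
    if a <= x <= b:
        return (x - a) / (b - a)
    if b < x < c:
        return 1.0
    if c <= x <= d:
        return (d - x) / (d - c)
    return 0.0
```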

The activation degree of the ith rule (using the minimum t-norm) is:

w_i = \min_{j=1}^{n} \mu_{ij}(x_j)    (20)

where n is the number of input dimensions, w_i is the activation degree (importance) of the ith rule for input vector x, and μ_{ij}(x_j) is the jth membership function in the ith rule. The output membership function of the ith rule is cut at height w_i, and the output is computed with center-of-gravity (COG) defuzzification:

y(x) = \frac{ \sum_{i=1}^{R} \int_{y \in \mathrm{supp}\,\mu_i(y)} y \, \mu_i(y) \, dy }{ \sum_{i=1}^{R} \int_{y \in \mathrm{supp}\,\mu_i(y)} \mu_i(y) \, dy }    (21)

where R is the number of rules. With this defuzzification method the integrals can be computed in closed form, and (21) becomes:

y(x) = \frac{1}{3} \cdot \frac{ \sum_{i=1}^{R} (C_i + D_i + E_i) }{ \sum_{i=1}^{R} \left[ 2 w_i (d_i - a_i) + w_i^2 (c_i + a_i - d_i - b_i) \right] }    (22)

C_i = 3 w_i (d_i^2 - a_i^2)(1 - w_i)

D_i = 3 w_i^2 (c_i d_i - a_i b_i)

E_i = w_i^3 (c_i - d_i + a_i - b_i)(c_i - d_i - a_i + b_i)
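The closed form (22) can be cross-checked against a brute-force evaluation of (21). The helper names below are our own, and the reading of μ_i as a trapezoid clipped at height w_i is an assumption consistent with the text:

```python
def cog_closed_form(rules):
    """COG output of Eq. (22).  rules: list of (w, a, b, c, d),
    each a trapezoid (a, b, c, d) clipped at activation height w."""
    num = den = 0.0
    for w, a, b, c, d in rules:
        Ci = 3 * w * (d * d - a * a) * (1 - w)
        Di = 3 * w * w * (c * d - a * b)
        Ei = w ** 3 * (c - d + a - b) * (c - d - a + b)
        num += Ci + Di + Ei
        den += 2 * w * (d - a) + w * w * (c + a - d - b)
    return num / (3 * den)

def cog_numeric(rules, steps=50000):
    """Reference value of Eq. (21) by midpoint-rule integration of
    the clipped trapezoids (for checking the closed form only)."""
    lo = min(r[1] for r in rules)
    hi = max(r[4] for r in rules)
    h = (hi - lo) / steps
    num = den = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        for w, a, b, c, d in rules:
            if a <= y <= d:
                mu = min(w, (y - a) / (b - a) if y < b else
                         ((d - y) / (d - c) if y > c else 1.0))
                num += y * mu * h
                den += mu * h
    return num / den
```

For a single full-height trapezoid (w = 1, a = 0, b = 1, c = 2, d = 4) both routes give the centroid 1.8, which is easy to confirm by hand.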

Then, the Jacobian matrix can be written as:

J = \left[ \frac{\partial y(x^{(p)})}{\partial a_{11}} \;\; \frac{\partial y(x^{(p)})}{\partial b_{11}} \; \cdots \; \frac{\partial y(x^{(p)})}{\partial a_{12}} \; \cdots \; \frac{\partial y(x^{(p)})}{\partial d_1} \; \cdots \; \frac{\partial y(x^{(p)})}{\partial d_R} \right]    (23)

where


\frac{\partial y(x^{(p)})}{\partial a_{ij}} = \frac{\partial y}{\partial w_i} \frac{\partial w_i}{\partial \mu_{ij}} \frac{\partial \mu_{ij}}{\partial a_{ij}}    (24)

\frac{\partial y(x^{(p)})}{\partial b_{ij}} = \frac{\partial y}{\partial w_i} \frac{\partial w_i}{\partial \mu_{ij}} \frac{\partial \mu_{ij}}{\partial b_{ij}}

\frac{\partial y(x^{(p)})}{\partial c_{ij}} = \frac{\partial y}{\partial w_i} \frac{\partial w_i}{\partial \mu_{ij}} \frac{\partial \mu_{ij}}{\partial c_{ij}}

\frac{\partial y(x^{(p)})}{\partial d_{ij}} = \frac{\partial y}{\partial w_i} \frac{\partial w_i}{\partial \mu_{ij}} \frac{\partial \mu_{ij}}{\partial d_{ij}}

From (20), w_i depends on the membership functions, and each membership function depends on only four parameters (breakpoints). So, the derivative of w_i is

\frac{\partial w_i}{\partial \mu_{ij}} = \begin{cases} 1, & \text{if } \mu_{ij} = \min_{k=1}^{n} \mu_{ik} \\ 0, & \text{otherwise} \end{cases}    (25)

and the derivatives of the membership functions will be

\frac{\partial \mu_{ij}}{\partial a_{ij}} = \frac{x_j^{(p)} - b_{ij}}{(b_{ij} - a_{ij})^2} N_{i,j,1}(x_j^{(p)})    (26)

\frac{\partial \mu_{ij}}{\partial b_{ij}} = \frac{a_{ij} - x_j^{(p)}}{(b_{ij} - a_{ij})^2} N_{i,j,1}(x_j^{(p)})

\frac{\partial \mu_{ij}}{\partial c_{ij}} = \frac{d_{ij} - x_j^{(p)}}{(d_{ij} - c_{ij})^2} N_{i,j,3}(x_j^{(p)})

\frac{\partial \mu_{ij}}{\partial d_{ij}} = \frac{x_j^{(p)} - c_{ij}}{(d_{ij} - c_{ij})^2} N_{i,j,3}(x_j^{(p)})

The derivative ∂y/∂w_i and the derivatives with respect to the output membership function parameters must also be computed. From (22):

\frac{\partial y}{\partial *_i} = \frac{1}{3} \cdot \frac{ \mathit{den} \, \dfrac{\partial F_i}{\partial *_i} - \mathit{num} \, \dfrac{\partial G_i}{\partial *_i} }{ (\mathit{den})^2 }    (27)

where *_i stands for w_i, a_i, b_i, c_i or d_i; num and den are the numerator and denominator of (22), respectively; and F_i and G_i are the ith members of the sums in the numerator and the denominator. The derivatives are as follows:


\frac{\partial F_i}{\partial w_i} = 3(d_i^2 - a_i^2)(1 - 2 w_i) + 6 w_i (c_i d_i - a_i b_i) + 3 w_i^2 \left[ (c_i - d_i)^2 - (a_i - b_i)^2 \right]    (28)

\frac{\partial G_i}{\partial w_i} = 2(d_i - a_i) + 2 w_i (c_i + a_i - d_i - b_i)

\frac{\partial F_i}{\partial a_i} = -6 w_i a_i + 6 w_i^2 a_i - 3 w_i^2 b_i - 2 w_i^3 (a_i - b_i)    (29)

\frac{\partial G_i}{\partial a_i} = -2 w_i + w_i^2

\frac{\partial F_i}{\partial b_i} = -3 w_i^2 a_i + 2 w_i^3 (a_i - b_i)

\frac{\partial G_i}{\partial b_i} = -w_i^2

\frac{\partial F_i}{\partial c_i} = 3 w_i^2 d_i - 2 w_i^3 (d_i - c_i)

\frac{\partial G_i}{\partial c_i} = w_i^2

\frac{\partial F_i}{\partial d_i} = 6 w_i d_i - 6 w_i^2 d_i + 3 w_i^2 c_i + 2 w_i^3 (d_i - c_i)

\frac{\partial G_i}{\partial d_i} = 2 w_i - w_i^2
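These derivatives can be verified by finite differences; a small sketch for ∂F_i/∂w_i of (28), with our own helper names:

```python
def F(w, a, b, c, d):
    """Numerator term C_i + D_i + E_i of Eq. (22)."""
    return (3 * w * (d * d - a * a) * (1 - w)
            + 3 * w * w * (c * d - a * b)
            + w ** 3 * (c - d + a - b) * (c - d - a + b))

def dF_dw(w, a, b, c, d):
    """Analytic derivative of Eq. (28); note that
    (c-d+a-b)(c-d-a+b) = (c-d)^2 - (a-b)^2."""
    return (3 * (d * d - a * a) * (1 - 2 * w)
            + 6 * w * (c * d - a * b)
            + 3 * w * w * ((c - d) ** 2 - (a - b) ** 2))

def numeric_dw(f, w, args, h=1e-6):
    """Central finite difference in w."""
    return (f(w + h, *args) - f(w - h, *args)) / (2 * h)
```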

5.4 Gene Transfer

The gene transfer operation allows the recombination of genetic information between two bacteria. First, the population is divided into two halves: the better bacteria form the superior half, the others the inferior half. One bacterium is randomly chosen from the superior half (the source bacterium) and another from the inferior half (the destination bacterium). A part (e.g. a rule) of the source bacterium is chosen, and this part overwrites a rule of the destination bacterium. This cycle is repeated N_inf times, where N_inf is the number of “infections” per generation.
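The operation can be sketched as follows; the list-of-rules bacteria and the error-returning `evaluate` callable are our own assumptions:

```python
import copy
import random

def gene_transfer(population, evaluate, n_inf=4):
    """Sketch of gene transfer: split the population into superior and
    inferior halves by error, then, n_inf times, copy a random rule
    from a random superior bacterium over a random rule of a random
    inferior one.  `evaluate` returns an error to minimize.
    Works on copies; the input population is left untouched."""
    pop = [copy.deepcopy(b) for b in population]
    for _ in range(n_inf):
        pop.sort(key=evaluate)                  # best bacteria first
        half = len(pop) // 2
        src = random.choice(pop[:half])         # superior half
        dst = random.choice(pop[half:])         # inferior half
        rule = random.randrange(len(src))
        dst[random.randrange(len(dst))] = copy.deepcopy(src[rule])
    return pop
```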

5.5 Stop Condition

If the population satisfies a stop condition, or the maximum number of generations N_gen is reached, the algorithm ends; otherwise it returns to the bacterial mutation step.


6 Conclusion

Fuzzy rule base reduction methods and fuzzy modeling techniques were introduced in this paper. We discussed how the complexity of a fuzzy rule base can be reduced. Classical fuzzy models deal with dense rule bases, where the universe of discourse is fully covered. Reducing the number of linguistic terms in each dimension leads to sparse rule bases and rule interpolation. The complexity can also be reduced by decreasing the dimension of the sub-rule bases using meta-levels, i.e. hierarchical fuzzy rule bases. It is also possible to apply interpolation in hierarchical rule bases.

The paper discussed automatic methods to determine the fuzzy rule base. After an overview of some clustering methods, the bacterial memetic algorithm was introduced. This approach combines the bacterial evolutionary algorithm and a gradient-based learning method, namely the Levenberg-Marquardt procedure. By this combination the advantages of both methods can be exploited.

Acknowledgments Supported by the Széchenyi University Main Research Direction Grant 2005,a National Scientific Research Fund Grant OTKA T048832 and the Australian Research Council.

Special acknowledgements to Prof. Antonio Ruano and Cristiano Cabrita (University ofAlgarve, Faro, Portugal) and to Dr. Alex Chong (formerly with Murdoch University, Perth,Australia).

References

P. Baranyi, D. Tikk, Y. Yam, and L. T. Kóczy, “Investigation of a new alpha-cut based fuzzy interpolation method”, Tech. Rep., Dept. of Mechanical and Automation Engineering, The Chinese University of Hong Kong, 1999.

J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, New York: Plenum Press, 1981.

J. Botzheim, B. Hámori, L. T. Kóczy, and A. E. Ruano, “Bacterial algorithm applied for fuzzy rule extraction”, in Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-based Systems, IPMU 2002, pp. 1021–1026, Annecy, France, 2002.

J. Botzheim, C. Cabrita, L. T. Kóczy, and A. E. Ruano, “Estimating Fuzzy Membership Function Parameters by the Levenberg-Marquardt Algorithm”, in Proceedings of the IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2004, pp. 1667–1672, Budapest, Hungary, 2004.

J. Botzheim, C. Cabrita, L. T. Kóczy, and A. E. Ruano, “Fuzzy rule extraction by bacterial memetic algorithm”, in Proceedings of the 11th World Congress of the International Fuzzy Systems Association, IFSA 2005, pp. 1563–1568, Beijing, China, 2005.

A. Chong, T. D. Gedeon, and L. T. Kóczy, “Projection Based Method for Sparse Fuzzy System Generation”, in Proceedings of the 2nd WSEAS International Conference on Scientific Computation and Soft Computing, pp. 321–325, Crete, 2002.

Y. Fukuyama and M. Sugeno, “A new method of choosing the number of clusters for the fuzzy c-means method”, in Proceedings of the 5th Fuzzy System Symposium, 1989.

T. D. Gedeon and L. T. Kóczy, “Conservation of fuzziness in rule interpolation”, Intelligent Technologies, vol. 1, International Symposium on New Trends in Control of Large Scale Systems, Herlany, pp. 13–19, 1996.

T. D. Gedeon, P. M. Wong, Y. Huang, and C. Chan, “Two Dimensional Fuzzy-Neural Interpolation for Spatial Data”, in Proceedings of Geoinformatics’97: Mapping the Future of the Asia-Pacific, Taiwan, vol. 1, pp. 159–166, 1997.

T. D. Gedeon, K. W. Wong, P. M. Wong, and Y. Huang, “Spatial Interpolation Using Fuzzy Reasoning”, Transactions on Geographic Information Systems, 7(1), pp. 55–66, 2003.

J. Ihara, “Group method of data handling towards a modelling of complex systems – IV”, Systems and Control (in Japanese), 24, pp. 158–168, 1980.

L. T. Kóczy and K. Hirota, “Approximate reasoning by linear rule interpolation and general approximation”, International Journal of Approximate Reasoning, 9, pp. 197–225, 1993a.

L. T. Kóczy and K. Hirota, “Interpolative reasoning with insufficient evidence in sparse fuzzy rule bases”, Information Sciences, 71, pp. 169–201, 1993b.

L. T. Kóczy and K. Hirota, “Interpolation in structured fuzzy rule bases”, in Proceedings of the IEEE International Conference on Fuzzy Systems, FUZZ-IEEE’93, pp. 803–808, San Francisco, 1993c.

E. H. Mamdani and S. Assilian, “An experiment in linguistic synthesis with a fuzzy logic controller”, Int. J. Man-Mach. Stud., vol. 7, pp. 1–13, 1975.

D. Marquardt, “An Algorithm for Least-Squares Estimation of Nonlinear Parameters”, J. Soc. Indust. Appl. Math., 11, pp. 431–441, 1963.

N. E. Nawa and T. Furuhashi, “Fuzzy System Parameters Discovery by Bacterial Evolutionary Algorithm”, IEEE Tr. Fuzzy Systems, vol. 7, pp. 608–616, 1999.

S. K. Pal, “Fuzzy set theoretic measure for automatic feature evaluation – II”, Information Sciences, pp. 165–179, 1992.

A. E. Ruano, C. Cabrita, J. V. Oliveira, L. T. Kóczy, and D. Tikk, “Supervised Training Algorithms for B-Spline Neural Networks and Fuzzy Systems”, in Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference, Vancouver, Canada, 2001.

M. Sugeno and T. Yasukawa, “A fuzzy-logic-based approach to qualitative modelling”, IEEE Transactions on Fuzzy Systems, 1(1), pp. 7–31, 1993.

M. Sugeno, T. Murofushi, J. Nishino, and H. Miwa, “Helicopter flight control based on fuzzy logic”, in Proceedings of IFES’91, Yokohama, 1991.

T. Takagi and M. Sugeno, “Fuzzy identification of systems and its applications to modeling and control”, IEEE Trans. Syst., Man, Cybern., vol. SMC-15, pp. 116–132, 1985.

D. Tikk and T. D. Gedeon, “Feature ranking based on interclass separability for fuzzy control application”, in Proceedings of the International Conference on Artificial Intelligence in Science and Technology (AISAT’2000), pp. 29–32, Hobart, 2000.

D. Tikk, Gy. Biró, T. D. Gedeon, L. T. Kóczy, and J. D. Yang, “Improvements and Critique on Sugeno’s and Yasukawa’s Qualitative Modeling”, IEEE Tr. on Fuzzy Systems, vol. 10, no. 5, October 2002.

L. X. Wang and J. M. Mendel, “Generating fuzzy rules by learning from examples”, IEEE Transactions on Systems, Man, and Cybernetics, 22(6), pp. 1414–1427, 1992.

K. W. Wong, C. C. Fung, and P. M. Wong, “A self-generating fuzzy rules inference system for petrophysical properties prediction”, in Proceedings of the IEEE International Conference on Intelligent Processing Systems, Beijing, 1997.

M. S. Yang and K. L. Wu, “A New Validity Index For Fuzzy Clustering”, in Proceedings of the IEEE International Conference on Fuzzy Systems, pp. 89–92, Melbourne, 2001.

L. A. Zadeh, “Outline of a new approach to the analysis of complex systems”, IEEE Trans. Syst., Man, Cybern., vol. SMC-1, pp. 28–44, 1973.


Computing with Antonyms

E. Trillas1, C. Moraga2, S. Guadarrama3, S. Cubillo and E. Castiñeira

Abstract This work tries to follow some agreements linguists seem to have on the semantic concept of antonym, and to model, by means of a membership function, an antonym aP of a predicate P whose use is known through a given μP.

Keywords: Antonym · Negate · Computing with Words · Semantics

1 Introduction

Nature shows many geometrical symmetries, and it is today beyond discussion that the mathematical concept of symmetry is of paramount importance for the scientific study of Nature. Language also shows the relevant symmetry known as antonymy which, as far as the authors know, has not received much attention from the point of view of Classical Logic, perhaps because of the scarcely syntactical character of this phenomenon of language. However, and perhaps because Fuzzy Logic deals more than Classical Logic with semantical aspects of language, the concept of antonym was considered early in Fuzzy Logic (see [1]) and originated some theoretical studies, as in papers [4] and [3].

In any dictionary an antonym of a word P is defined as one of its opposite words aP, and one can identify “opposite” as “reached by some symmetry on the universe of discourse X” on which the word P is used.

E. Trillas · C. Moraga · S. Guadarrama
Technical University of Madrid, Department of Artificial Intelligence

S. Cubillo · E. Castiñeira
Technical University of Madrid, Department of Applied Mathematics, Madrid, Spain

1 Corresponding author: [email protected]. The work has been partially supported by CICYT (Spain) under project TIN2005-08943-C02-01.

2 The work of C. Moraga has been supported by the Spanish State Secretary of Education and Universities of the Ministry of Education, Culture and Sports (Grant SAB2000-0048) and by the Social Fund of the European Community.

3 The work of S. Guadarrama has been supported by the Ministry of Education, Culture and Sports of Spain (Grant AP2000-1964).

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. 133 © Springer 2007


134 E. Trillas et al.

In [11] it was settled that the membership function μP of a fuzzy set in X labelled P (equivalently, the compatibility function for the use on X of predicate P) is nothing else than a general measure on the set X(P) of atomic statements “x is P” for any x in X, provided that a preorder translating the secondary use of P on X has been established on X(P).

2 Fuzzy Sets

Let P be a predicate: the proper name of a property, a common name, or a linguistic label. The use of P on a universe of objects X depends not only on X and P themselves but also on the way P is related to X.

Three aspects of the use of a predicate P on X can be considered:

• Primary use, or the descriptive use of the predicate, expressed by the statements “x is P”, for x in X.

• Secondary use, or the comparative use of the predicate, expressed by the statements “x is less P than y”, for x, y in X.

• Tertiary use, or the quantitative use of the predicate, expressed by the statements “degree up to which x is P”, for x in X.

A fuzzy set labeled P on a universe of discourse X is nothing else than a function μP : X → [0, 1]. On the other hand, when using P on X by means of a numerical characteristic of P, φP : X → S, the atomic statement “x is P” is made equivalent to a corresponding statement “φP(x) is P∗”; this is the case, for example, of interpreting “x is cold” as “temperature of x is low”, where low is predicated on an interval S = [a, b] of numerical values a temperature can take.

At this point, it is not to be forgotten that six different sets are under consideration: the ground set X; the set X(P) of the statements “x is P”; the set S of the values of a numerical characteristic φ; the set X(P∗) of the statements “φ(x) is P∗”; the partially ordered set L of the degrees up to which “x is P”; and the partially ordered set L∗ of the degrees up to which “φ(x) is P∗” (see Fig. 1).

We will say that the use of P on X is R-measurable (see [11]) if there exists a mapping φ : X → S, with S a non-empty subset of R endowed with a pre-order ≤P∗ and a ≤P∗-measure m on S, such that:

1. φ is injective.
2. φ(x) ≤P∗ φ(y) iff “x is less P than y”, for x, y in X.
3. The degree up to which “x is P” is given by m(φ(x)), for each x in X.

After an experimental process, a mathematical model for the use of P on X isobtained. Of course the model can be reached thanks to the fact that P is used on Xby means of a numerical characteristic given by φ.

To measure is, after all, to evaluate something that the elements of a given set exhibit, relative to some characteristic they share. We always measure concrete aspects of some elements; for example, the height but not the volume of a building,


Computing with Antonyms 135

Fig. 1 Sets under consideration. [Figure: elements x, y of the universe X are mapped by φ into S; the statements “x is P”, “y is P”, “y is less P than x” form X(P) with degrees in (L, ≤P), and the statements “φ(x) is P∗”, “φ(y) is P∗” form X(P∗) with degrees in (L∗, ≤P∗).]

the wine’s volume in barrels but not the wine’s alcoholic rank, the probability ofgetting an odd score when throwing a die but not the die’s size, and so on.

Hence, at the very beginning we consider a set X of objects sharing, possibly among others, a common characteristic K of which we know how it varies on X; that is, for all x, y ∈ X we know whether or not it is the case that “x shows K less than y shows K”. Therefore, we at least know the qualitative and perception-based relation ≤K on X × X defined as:

“x ≤K y if and only if x shows K less than y shows K”.

Since it does not seem too strong to assume that this relation verifies the reflexive and transitive properties, we will consider that ≤K is a pre-order.

A fuzzy set P on a universe of discourse X is only known, as discussed earlier, once a membership function μP : X → [0, 1] is fixed, and provided that this function reflects the meaning of the statements “x is P” for all x in X. This is the same as considering that μP translates the use of P on X, and in this sense μP is a model of this use.

3 Antonyms

3.1 Linguistic Aspects of Antonyms

Antonymy is a phenomenon of language linking couples of words (P, Q) that grammarians call pairs of antonyms, and it seems very difficult to define and study in a formal way. Language is too complex to admit strictly formal definitions of the kind obtained in other domains after experimentation and mathematical modelling. Nevertheless, there are some properties that different authors attribute to a pair of antonyms (see [5], [6], [7], [8]). These properties can be summarized in the following points:


1. In considering a pair (P, Q) of antonyms, Q is an antonym of P and P an antonym of Q. Then, from Q = aP and P = aQ it follows that P = a(aP) and Q = a(aQ): in a single pair of antonyms the relation established by (P, aP) is a symmetry.

2. For any object x such that the statements “x is P” and “x is aP” are meaningful, “if x is aP, then it is not the case that x is P”. The converse, “if it is not the case that x is P, then x is aP”, does not hold. A clear example is given by the pair (full, empty): we say “if the bottle is empty, it is not full” but not “if the bottle is not full, it is empty”. In short, aP implies Not-P, but it is not the case that Not-P implies aP, except in some particular situations in which (P, Not-P) happens to be a pair of antonyms. That is, aP = Not-P happens only as a limiting case.

3. Pairs of antonyms (P, Q) are “disjoint”. This point is difficult to understand without some kind of formalization allowing one to know precisely what grammarians understand by disjoint. Anyway, point 2 suggests that it means “contradiction”, a concept that, as is well known, coincides with “incompatibility” in Boolean algebras, but not always in other logical structures.

4. The more applicable a predicate P is to objects, the less applicable aP is to these objects, and the less applicable P is, the more applicable aP is. For example, with the pair (tall, short) and a subject x, the taller x is, the less short x is, and the less tall x is, the shorter she/he is.

5. For aP to be a “good” antonym of P, among the objects x to which P and aP apply there should be a zone of neutrality, that is, a subset of X to which neither P nor aP can be applied.

In what follows we will try to take advantage of points 1 to 5, by translating these properties to the compatibility functions μP and μaP, to obtain the basic relations that a fuzzy set μaP should verify to be considered the antonym of a given fuzzy set μP, and to define some types of antonymy, synthesizing and extending the work that was done in [10], [13] and [14].

3.2 Antonyms and Fuzzy Sets

Since symmetries will be used below, we first specify them in a general way. As was explained in Sect. 2, there are three sets on which symmetries are to be considered: the universe [a, b] of numerical values that the elements of the universe of discourse X may take due to the numerical characteristic of P, the unit interval [0, 1], and the set F([a, b]) of all fuzzy sets of [a, b]. Symmetries on [a, b] are mappings α : [a, b] → [a, b] such that α² = id[a,b] and α(a) = b (and hence α(b) = a). Symmetries on [0, 1] are mappings N : [0, 1] → [0, 1] such that N² = id[0,1] and N(0) = 1 (and hence N(1) = 0). And a symmetry on F([a, b]) should be a mapping A : F([a, b]) → F([a, b]) such that A² = idF([a,b]) and A(μ∅) = μ[a,b] (and hence A(μ[a,b]) = μ∅).

Page 144: Forging New Frontiers: Fuzzy Pioneers I

Computing with Antonyms 137

It is easy to prove that for each couple of symmetries (N, α) the combined mapping N/α : F([a, b]) → F([a, b]), defined by (N/α)(μ) = N ◦ μ ◦ α, is a symmetry on F([a, b]) (see [13]). Furthermore, if N is a strong negation (that is, x ≤ y implies N(y) ≤ N(x)), then N/α is an involution (that is, a dual automorphism of period two) on F([a, b]). S. Ovchinnikov (see [4]) proved that any such involution A : F([a, b]) → F([a, b]) is of the form A(μ)(x) = Nx(μ(α(x))) for a symmetry α on [a, b] and a family of dual automorphisms {Nx; x ∈ [a, b]} on [0, 1] such that Nα(x) = Nx−1. Of course, if α = id[a,b], the functions Nx are strong negations.

Negations N/α are not always functionally expressible; that is, there does not generally exist a symmetry N* : [0, 1] → [0, 1] such that (N/α)(μ) = N* ◦ μ. For example, if X = [0, 1] with N = α = 1 − id[0,1], it results (N/α)(μ)(x) = 1 − μ(1 − x), and (N/α)(μ)(0) = 1 − μ(1); it is enough to take μ such that μ(1) = 1, μ(0) = 0 to have (N/α)(μ)(0) = 0, but N* ◦ μ(0) = N*(μ(0)) = N*(0) = 1.
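As a small computational illustration (ours, not part of the original development), the following sketch checks that N/α with N = α = 1 − id is its own inverse on F([0, 1]), and reproduces the counterexample to functional expressibility; all names are illustrative.

```python
# Sketch: the symmetry (N/alpha)(mu) = N o mu o alpha on F([0, 1]),
# with N = alpha = 1 - id (names here are illustrative).

def N(t):            # strong negation on [0, 1]
    return 1.0 - t

def alpha(x):        # symmetry on the universe [0, 1]
    return 1.0 - x

def sym(mu):
    """(N/alpha)(mu) = N o mu o alpha."""
    return lambda x: N(mu(alpha(x)))

# any fuzzy set with mu(1) = 1, mu(0) = 0, e.g. mu = id
mu = lambda x: x

# (N/alpha)^2 = identity: applying sym twice recovers mu
mu2 = sym(sym(mu))
assert all(abs(mu2(x) - mu(x)) < 1e-12 for x in [0.0, 0.25, 0.5, 1.0])

# counterexample: (N/alpha)(mu)(0) = 1 - mu(1) = 0,
# while any pointwise N*(mu(0)) = N*(0) = 1, so no N* can express it
assert sym(mu)(0.0) == 0.0
```

The check at 0 is exactly the argument in the text: the value of (N/α)(μ) at 0 depends on μ(1), so it cannot be computed from μ(0) alone.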

Let us try to model point 4 of the former section. If μP and μaP are the compatibility functions of the predicates P and aP in the universe of discourse X, the implication “if 0 < μP(x) ≤ μP(y), then μaP(y) ≤ μaP(x)” must be satisfied. Nevertheless, this condition would suit the requirements when working with predicates that, translated to their numerical characteristic in [a, b], are monotonic, that is, when the preorder induced by the predicate P in [a, b] (see [11]) is a total order, but not in general; this is why a new point of view on point 4 of the former section could be necessary.

3.3 Example 1

Given the predicate P = “close to 4” on the interval [0, 10], it should be aP = “far from 4”. But by taking the involution α given by α(x) = 10 − x, it results μaP = μP ◦ α, as shown in Fig. 2, which obviously does not correspond to “far from 4”.

Fig. 2 Solid line: Close to 4. Dotted line: Close to 6


138 E. Trillas et al.

Fig. 3 Solid line: Close to 4. Dotted line: Far from 4

If we consider α defined through the two intervals [0, 4], [4, 10], with α1 : [0, 4] → [0, 4], α1(x) = 4 − x, and α2 : [4, 10] → [4, 10], α2(x) = 14 − x, we obtain μaP as shown in Fig. 3, which better reflects “far from 4”.

Since the use of a predicate P on X is here viewed as:

1. The ordering ≤P given in X by “x is less P than y” (the comparative use of P), and
2. A ≤P-measure μ(x is P) = μP(x) (measuring to what extent x is P),

it seems reasonable that the use of aP should be viewed as:

1. The ordering ≤aP = (≤P)−1,
2. A ≤aP-measure μ(x is aP) = μ(αP(x) is P), or μaP(x) = μP(αP(x)), with αP a symmetry (involution) for (≤P)−1 = ≤aP.

As the ordering ≤P is obtained in practice from the function μP by looking at the subintervals of [a, b] on which μP is either decreasing or non-decreasing, it results that αP should be an involution (symmetry) on each of them. Then, what is most natural is to define α : [a, b] → [a, b] by

α(x) = α1(x) if x ∈ [a, a1]; α2(x) if x ∈ [a1, a2]; . . . ; αn(x) if x ∈ [an−1, b],

with involutions αi : [ai−1, ai] → [ai−1, ai]. Of course, what is not obtained is that α reverses the order ≤; but it does reverse the order ≤P.

Hence, to obtain a perfect antonym (see below, Definition 4.17) of P by using μP, the following steps should be observed:

1. Determine the order ≤P by looking at the function μP.
2. Consider the subintervals [ai−1, ai].
3. Construct an αi on each interval [ai−1, ai].
4. Obtain μaP(x) = μP(αi(x)), if x ∈ [ai−1, ai].


It is immediate that this is the algorithm used in the example with the subintervals [0, 4] and [4, 10].
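The four steps above can be sketched in code. The following is a minimal illustration (ours; the concrete membership function for “close to 4” is an assumption, as the paper does not give its formula), using the subintervals [0, 4] and [4, 10]:

```python
# Minimal sketch of the four-step construction of a perfect antonym:
# split [a, b] at the points where mu_P changes monotonicity, put an
# involution alpha_i on each subinterval, and set mu_aP = mu_P o alpha_i.

def mu_P(x):
    """'close to 4' on [0, 10] (triangular; our illustrative choice)."""
    return max(0.0, 1.0 - abs(x - 4.0) / 4.0)

# Step 2: subintervals on which mu_P is monotonic
intervals = [(0.0, 4.0), (4.0, 10.0)]

# Step 3: an involution alpha_i(x) = a_i + b_i - x on each subinterval
def alpha(x):
    for a_i, b_i in intervals:
        if a_i <= x <= b_i:
            return a_i + b_i - x
    raise ValueError("x outside [0, 10]")

# Step 4: mu_aP(x) = mu_P(alpha_i(x))
def mu_aP(x):
    return mu_P(alpha(x))

# 'far from 4': full membership at both extremes, zero at 4
assert mu_aP(0.0) == 1.0 and mu_aP(10.0) == 1.0
assert mu_aP(4.0) == 0.0
```

On [4, 10] the involution is αi(x) = 14 − x, exactly as in the example above.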

3.4 Example 2

Let us consider the predicate P = “very close to 0 or close to 10” given in X = [0, 10] by the membership function in Fig. 4.

μP(x) = (x − 4)²/16 if 0 ≤ x ≤ 4; 0 if 4 < x ≤ 6; (x − 6)/4 if 6 < x ≤ 10.

Fig. 4 Representation of “very close to 0 or close to 10”

A possible way to find an antonym will be, as in [14], considering two strong negations on two intervals of X, for example α1 : [0, 5] → [0, 5] given by α1(x) = 5 − x and α2 : [5, 10] → [5, 10] given by α2(x) = 15 − x. So the fuzzy set will be μaP(x) = μP(α(x)), where α(x) = α1(x) if x ∈ [0, 5] and α(x) = α2(x) if x ∈ [5, 10].

Now this fuzzy set can be understood as a membership function of the antonym of P: “very far from 0 and far from 10” (see Fig. 5). Notice that it is μP(5/2) = 9/64 ≤ μP(15/2) = 3/8, and also μaP(5/2) = 9/64 ≤ μaP(15/2) = 3/8.

This would lead us to the necessity of considering the condition “if μP(x) ≤ μP(y) then μaP(x) ≥ μaP(y)” in each interval in which μP is monotonic, that is, to consider the preorder induced by μP. As introduced in [11], if μP : X → [0, 1] is the membership function of a predicate P in X and [a, b] is the interval in which its numerical characteristic takes values, and if there exists a partition {c0, c1, . . . , cn} of this interval such that the functions (μP ◦ (ΦP)−1)|[ci−1, ci] are monotonic, then an (n-sharpened) preorder can be defined in [a, b] in the following way: if r, s ∈ [a, b], it is r ≤μP s if and only if there exists i ∈ {1, . . . , n} such that r, s ∈ [ci−1, ci] and μP(r) ≤ μP(s), with μP = μP ◦ (ΦP)−1. Furthermore, the predicate P also induces a preorder in X: if x, y ∈ X it is defined


μaP(x) = (1 − x)²/16 if 1 < x ≤ 5; 0 if 0 ≤ x ≤ 1 or 9 < x ≤ 10; (9 − x)/4 if 5 < x ≤ 9.

Fig. 5 Representation of “very far from 0 and far from 10”

x ≤P y ⇔ φP(x) ≤μP φP(y),

which we must take into account to formalize point 4.
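The failure of global order reversal in Example 2 can be checked numerically; the sketch below (ours) reproduces μP, the piecewise symmetry α, and the values 9/64 and 3/8 quoted above.

```python
# Numerical check of Example 2: P = "very close to 0 or close to 10"
# on [0, 10], with alpha_1(x) = 5 - x on [0, 5], alpha_2(x) = 15 - x
# on [5, 10], and mu_aP = mu_P o alpha.

def mu_P(x):
    if 0 <= x <= 4:
        return (x - 4) ** 2 / 16
    if 4 < x <= 6:
        return 0.0
    return (x - 6) / 4          # 6 < x <= 10

def alpha(x):
    return 5 - x if x <= 5 else 15 - x

def mu_aP(x):
    return mu_P(alpha(x))

# the global order-reversal condition fails: both inequalities
# point the same way at x = 5/2 and y = 15/2
assert mu_P(2.5) == 9 / 64 and mu_P(7.5) == 3 / 8
assert mu_aP(2.5) == 9 / 64 and mu_aP(7.5) == 3 / 8
```

This is precisely why the condition must be imposed interval by interval, via the preorder induced by μP.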

4 Regular and Perfect Antonyms Pairs

Let X be a universe of discourse and P a predicate naming a gradual property of the elements of X; that is, with each atomic statement “x is P” there is associated a number μP(x) ∈ [0, 1] reflecting the degree up to which x verifies the property named by P.

To approach the concept of antonym based on the former five points, let us consider involutive mappings A : F(X) → F(X), that is, mappings verifying A² = idF(X).

Definition 4.1 A couple (μ, A(μ)) is a pair of antonyms if

A.1. There exists a strong negation Nμ such that A(μ) ≤ Nμ ◦ μ (that is, μ and A(μ) are Nμ-contradictory fuzzy sets).
A.2. If x ≤μ y, then y ≤A(μ) x.

Of course, if the preorder induced by μ on X is a total order (that is, μ is monotonic), then axiom A.2 turns into: “if 0 < μ(x) ≤ μ(y), then A(μ)(x) ≥ A(μ)(y)”.

For the sake of simplicity, and if there is no confusion, let us write μa = A(μ), μ′ = Nμ ◦ μ. Of course, rules A.1 and A.2 do not determine a unique A(μ) for each μ ∈ F(X), and some definitions should be introduced both to distinguish among diverse kinds of antonyms and to take the former points 1 to 5 into account.


Theorem 4.2 If (μ, μa) is a pair of antonyms:

a) μ(x) ≥ ε implies μa(x) ≤ Nμ(ε), and μa(x) ≥ ε implies μ(x) ≤ Nμ(ε), for all ε ∈ [0, 1].
b) If μ(x) = 1, then μa(x) = 0; if μa(x) = 1, then μ(x) = 0.
c) Property A.1 is equivalent to μ ≤ Nμ ◦ μa.
d) Property A.1 is equivalent to the existence of a t-norm Wμ in Łukasiewicz’s family such that Wμ ◦ (μ × μa) = μ∅. That is, the Wμ-intersection of μ and μa is empty, or μ and μa are Wμ-disjoint.

Proof. (a) From ε ≤ μ(x), it follows Nμ(μ(x)) ≤ Nμ(ε) and μa(x) ≤ Nμ(ε), for any ε ∈ [0, 1]. (b) Take ε = 1 in (a). (c) From μa ≤ Nμ ◦ μ follows Nμ ◦ Nμ ◦ μ = μ ≤ Nμ ◦ μa, and reciprocally. (d) As is well known (see [2]), there exists a strictly increasing function gμ : [0, 1] → [0, 1] verifying gμ(0) = 0, gμ(1) = 1 and such that Nμ = gμ−1 ◦ (1 − id) ◦ gμ. Hence, A.1 can be written as μa(x) ≤ gμ−1(1 − gμ(μ(x))) or, equivalently, gμ(μa(x)) + gμ(μ(x)) − 1 ≤ 0. This inequality is equivalent to Max(0, gμ(μa(x)) + gμ(μ(x)) − 1) = 0, or to 0 = gμ−1(W(gμ(μa(x)), gμ(μ(x)))) = gμ−1 ◦ W ◦ (gμ × gμ)(μ(x), μa(x)). Then, with Wμ = gμ−1 ◦ W ◦ (gμ × gμ), it is Wμ ◦ (μ × μa) = μ∅. The reciprocal is immediate.
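Theorem 4.2.d can be illustrated with a concrete generator; in the sketch below (ours) the choice gμ(t) = t² is an assumption, giving the strong negation Nμ(t) = √(1 − t²) and the corresponding t-norm Wμ in Łukasiewicz’s family.

```python
import math

# Sketch of Theorem 4.2.d with the generator g(t) = t^2 (our choice):
# N(t) = g^{-1}(1 - g(t)) is a strong negation and
# W(u, v) = g^{-1}(max(0, g(u) + g(v) - 1)) is the associated
# t-norm in Lukasiewicz's family.

g = lambda t: t ** 2
g_inv = lambda s: math.sqrt(s)

def N(t):
    return g_inv(1 - g(t))

def W(u, v):
    return g_inv(max(0.0, g(u) + g(v) - 1))

# For the complementary pair (mu, N o mu) the W-intersection is empty:
for t in [0.0, 0.2, 0.5, 0.8, 1.0]:
    assert W(t, N(t)) < 1e-6

# N is involutive (a strong negation)
for t in [0.0, 0.3, 0.7, 1.0]:
    assert abs(N(N(t)) - t) < 1e-12
```

With g = id this collapses to the ordinary Łukasiewicz t-norm W(u, v) = max(0, u + v − 1) and N(t) = 1 − t.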

Definition 4.3 A pair of antonyms (μ, μa) is strict whenever μa ≠ μ′, that is, if there exists some x ∈ X such that μa(x) < Nμ(μ(x)). Otherwise, the pair is non-strict or complementary, and μa = μ′.

Theorem 4.4 (μ, μa) is a non-strict pair of antonyms if and only if Sμ ◦ (μ × μa) = μX, with Sμ the t-conorm Nμ-dual of the t-norm Wμ in Theorem 4.2.

Proof. If the pair (μ, μa) is non-strict, from μa = μ′ it follows Sμ ◦ (μ × μa) = Nμ ◦ Wμ ◦ (Nμ ◦ μ × Nμ ◦ μ′) = Nμ ◦ Wμ ◦ (μ′ × μ) = Nμ ◦ μ∅ = μX. Reciprocally, if Sμ ◦ (μ × μa) = μX, it follows Wμ ◦ (Nμ ◦ μ × Nμ ◦ μa) = μ∅ and gμ−1 ◦ W ◦ (gμ ◦ Nμ ◦ μ × gμ ◦ Nμ ◦ μa) = μ∅, or

W ◦ (1 − gμ ◦ μ × 1 − gμ ◦ μa) = μ∅,

that is, for any x ∈ X,

Max(0, 1 − gμ(μ(x)) − gμ(μa(x))) = 0.

Then, 1 − gμ(μ(x)) − gμ(μa(x)) ≤ 0, and gμ−1(1 − gμ(μ(x))) ≤ μa(x). Hence, μ′(x) ≤ μa(x) and, by A.1, μ′ = μa.

Corollary 4.5 A pair of antonyms (μ, μa) is strict if and only if Sμ ◦ (μ × μa) ≠ μX, with Sμ the t-conorm of Theorem 4.4.

Remark If μ is a fuzzy set labelled P (μ = μP) and μa can be identified by a label Q, then Q = aP and the words (P, Q) form a pair of antonym words. If this pair is non-strict then aP ≡ ¬P, and if it is strict aP is not coincidental with ¬P. In general, for a pair of antonym words (P, aP) translated by means of their use in X into the fuzzy sets μP and μaP, respectively, it happens (Theorem 4.2.d) that the intersection μP ∩ μaP obtained by the t-norm Wμ is empty: μP ∩ μaP = μ∅; that is, in this sense μP and μaP are disjoint fuzzy sets. Of course, this emptiness is equivalent to the contrariness of μP and μaP relative to a strong negation NP that allows defining μ¬P = NP ◦ μP (μaP ≤ NP ◦ μP, the version of A.1 with Nμ = NP). Moreover, the fuzzy sets μP and μaP are a partition of X in the theory (F(X), TP, SP, NP) (with TP = Wμ, SP = Wμ*, and NP = Nμ, where Wμ* is the t-conorm Nμ-dual of the t-norm Wμ, see Theorem 4.2) if and only if aP ≡ ¬P (Theorem 4.4), as only in this case is μP ∪ μaP = μX and μP ∩ μaP = μ∅.

Definition 4.6 Given a pair of antonyms (μ, μa) and a t-norm T, μm = T ◦ (μ′ × (μa)′) = T ◦ (Nμ ◦ μ × Nμ ◦ μa) is the T-medium fuzzy set between μ and μa.

Theorem 4.7 For any t-norm T, μm is 1-normalized if and only if there exists x ∈ X such that μ(x) = μa(x) = 0.

Proof. If μ(x) = μa(x) = 0, it is μm(x) = T(Nμ(0), Nμ(0)) = T(1, 1) = 1, and μm is 1-normalized at x. Reciprocally, if μm is 1-normalized at a point x, from 1 = μm(x) = T(Nμ(μ(x)), Nμ(μa(x))) it follows 1 = Nμ(μ(x)) = Nμ(μa(x)), or μ(x) = μa(x) = 0, as T ≤ Min.

Hence, that μm be 1-normalized is equivalent to {x ∈ X; μ(x) = μa(x) = 0} = {x ∈ X; μm(x) = 1} ≠ ∅. Let us denote this set by N(μ, μa) and call it the Neutral Zone of X for the pair of antonyms (μ, μa).
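As an illustration (our example, not the authors’): two trapezoidal antonyms that both vanish on a middle interval have a non-empty neutral zone there, and the Min-medium of Definition 4.6 reaches 1 on it.

```python
# Our illustrative pair on [0, 10]: mu vanishes for x >= 4,
# mu_a vanishes for x <= 6, so both are 0 on [4, 6] (the neutral zone)
# and the Min-medium with N = 1 - id reaches 1 there.

def mu(x):     # e.g. "low"
    return max(0.0, min(1.0, (4.0 - x) / 2.0))

def mu_a(x):   # its antonym, e.g. "high" (mirror image: mu_a(x) = mu(10 - x))
    return mu(10.0 - x)

def medium(x):  # T-medium with T = Min and N = 1 - id
    return min(1.0 - mu(x), 1.0 - mu_a(x))

# neutral zone: both memberships are 0 on [4, 6], the medium is 1 there
assert mu(5.0) == 0.0 and mu_a(5.0) == 0.0
assert medium(5.0) == 1.0
# outside the zone the medium drops below 1
assert medium(1.0) == 0.0
```

The medium fuzzy set can be read as the “neither low nor high” region between the two antonyms.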

Remark: Logical and linguistic contradiction
As is known (see [12]), a fuzzy set μP is selfcontradictory (with respect to the considered negation) when μP ≤ μ¬P = μ′P = N ◦ μP. This concept is directly derived from the logical implication “If P, then not P”; but in language, contradiction is often used as “If P, then aP”.

Definition 4.8 A fuzzy set labelled P is linguistically selfcontradictory if μP ≤ μaP for some antonym aP of P.

Of course, linguistic selfcontradiction implies logical selfcontradiction: as μaP ≤ μ¬P, from μP ≤ μaP it follows μP ≤ μ¬P.

As is also known (see [15]), in very general structures comprising all theories of fuzzy sets and, in particular, the standard ones ([0, 1]X, T, S, N), it holds:

1. μP ∧ μ′P ≤ (μP ∧ μ′P)′, the principle of non-contradiction (NC).
2. (μP ∨ μ′P)′ ≤ μP ∨ μ′P, the principle of excluded middle (EM).

In this formulation, the principle NC says that μP ∧ μ′P is logically selfcontradictory, and the principle EM says that (μP ∨ μ′P)′ is also selfcontradictory, as (μP ∨ μ′P)′ ≤ μP ∨ μ′P = ((μP ∨ μ′P)′)′. What about μP ∧ μaP and μP ∨ μaP with a given antonym aP of P?


Theorem 4.9 For any antonym aP of P, μP ∧ μaP is always logically selfcontradictory (see [16]).

Proof. As μaP ≤ μ¬P, it is μP ∧ μaP ≤ μP ∧ μ¬P. Hence, (μP ∧ μ¬P)′ ≤ (μP ∧ μaP)′ and, as it is always μP ∧ μ¬P ≤ (μP ∧ μ¬P)′, it follows that μP ∧ μaP ≤ (μP ∧ μaP)′. What still remains open is to know whether μP ∧ μaP is linguistically selfcontradictory.

Concerning the case of ¬(P ∨ aP) for any antonym aP of P, it is obvious that, provided the duality law ¬(P ∨ Q) = ¬P ∧ ¬Q holds, it is logically selfcontradictory if and only if the medium term ¬P ∧ ¬aP is logically selfcontradictory. What still remains open is what happens without duality, and when a(P ∨ aP) is linguistically selfcontradictory.

Example The fuzzy set (μP ∨ μaP)′ represented in Fig. 6 is not logically selfcontradictory because, assuming T = Min, S = Max and N = 1 − id, the medium term μ′P ∧ μ′aP is normalized and, hence, is not logically selfcontradictory.

Fig. 6 Medium term is not selfcontradictory (curves shown: μP, μaP and μ¬P∧¬aP)

Assuming T = Prod, S = Prod* and N = 1 − id, the fuzzy set (μP ∨ μaP)′ represented in Fig. 7 is, however, logically selfcontradictory, because the medium term μ′P ∧ μ′aP is below the fixed level of the negation N = 1 − id and hence is logically selfcontradictory.

Definition 4.10 A pair of antonyms (μ, μa) is regular whenever the Neutral Zone N(μ, μa) is not empty.

Of course, a pair of antonyms (μ, μa) is regular if and only if for any t-norm T the T-medium μm is 1-normalized and, hence, if and only if the Min-medium between μ and μa is 1-normalized; that is, if μm(x) = Min(Nμ(μ(x)), Nμ(μa(x))) = 1 for some x ∈ X.

Theorem 4.11 Any regular pair of antonyms is strict.

Proof. If N(μ, μa) ≠ ∅, there exists x in X such that μ(x) = μa(x) = 0. Then μ′(x) = Nμ(0) = 1, and 0 = μa(x) < 1 = μ′(x), so μa < μ′.


Fig. 7 Medium term is selfcontradictory (curves shown: μaP, μ′P, μ′aP and μ¬P∧¬aP; the 0.5 level is marked)

Corollary 4.12 Any non-strict pair of antonyms is non-regular.

The set E(μ, μa) = {x ∈ X; μ(x) = μa(x)} obviously verifies N(μ, μa) ⊂ E(μ, μa). Then, if N(μ, μa) is non-empty, so is E(μ, μa). Thus, for any regular pair of antonyms it is E(μ, μa) ≠ ∅.

Notice that, if μ = μP and μa = μaP, the points x in E(μP, μaP) verify that the truth-values of the statements “x is P” and “x is aP” are equal. Then, points in E(μP, μaP) are those that can be considered both as P and as aP.

Theorem 4.13 For any pair of antonyms (μ, μa), it is E(μ, μa) ⊂ {x ∈ X; μ(x) ≤ gμ−1(1/2)}, with Nμ = gμ−1 ◦ (1 − id[0,1]) ◦ gμ, and gμ−1(1/2) ∈ (0, 1) the fixed point of Nμ.

Proof. Obviously, gμ−1(1/2) ∈ (0, 1) is the fixed point of Nμ. If x ∈ E(μ, μa), from μ(x) = μa(x) ≤ μ′(x) it follows μ(x) ≤ gμ−1(1 − gμ(μ(x))), which is equivalent to gμ(μ(x)) ≤ 1/2. The theorem obviously holds if E(μ, μa) = ∅.

Consequently, it is N(μ, μa) ⊂ E(μ, μa) ⊂ {x ∈ X; μ(x) ≤ gμ−1(1/2)}.

Theorem 4.14 If μ is monotonic and E(μ, μa) ≠ ∅, both functions μ and μa take on E(μ, μa) at most two values, 0 and r, with 0 < r ≤ gμ−1(1/2).

Proof. If E(μ, μa) − N(μ, μa) ≠ ∅, the functions μ and μa are constantly equal to a non-zero value on this difference set. In fact, let x, y ∈ E(μ, μa) − N(μ, μa). It should be 0 < μ(x) = μa(x), 0 < μ(y) = μa(y). Let us suppose 0 < μ(x) ≤ μ(y); then μa(y) ≤ μa(x) because of A.2. Hence:

μa(y) ≤ μa(x) = μ(x) ≤ μ(y) = μa(y),

and μ(x) = μ(y) = r, with r ≤ gμ−1(1/2). Of course, on N(μ, μa) it is μ(x) = μa(x) = 0.

Theorem 4.15 If (μ, μa) is a non-strict pair of antonyms, E(μ, μa) = {x ∈ X; μ(x) = gμ−1(1/2)}.


Proof. If μa = μ′, E(μ, μa) = {x ∈ X; μ(x) = Nμ(μ(x))} = {x ∈ X; μ(x) = gμ−1(1 − gμ(μ(x)))} = {x ∈ X; μ(x) = gμ−1(1/2)}. Of course, in this case it is N(μ, μa) = ∅.

Corollary 4.16 If (μ, μa) is a pair of antonyms such that E(μ, μa) ≠ ∅ and the value r of μ verifies r < gμ−1(1/2), the pair is strict.

Proof. Obvious: if the pair were non-strict, by Theorem 4.15 it should be r = gμ−1(1/2).

A symmetry α : X → X may or may not have fixed points. For example, in a Boolean algebra, α(x) = x′ (the complement of x) is a symmetry, as α²(x) = α(x′) = x′′ = x and α(0) = 0′ = 1. It is also an involution, since if x ≤ y then α(x) ≥ α(y); nevertheless, there is no x such that α(x) = x′ = x. If X is a closed interval [a, b] of the real line, any symmetry α : [a, b] → [a, b] such that x < y implies α(x) > α(y) and α(a) = b (that is, any involution) has a single fixed point α(z) = z ∈ (a, b).

Definition 4.17 A pair of antonyms (μ, μa) is perfect if there exists a symmetry αμ : [a, b] → [a, b] such that μa = μ ◦ αμ.

Notice that, in this case, if μ is crisp, so is μa. Of course, to have μaa = μ, as μaa = μa ◦ αμa = μ ◦ αμ ◦ αμa, a sufficient condition is that αμ ◦ αμa = idX, or αμ = αμa. In what follows this “uniformity” condition will be supposed.

Theorem 4.18 If αμ has some fixed point, a sufficient condition for a perfect pair of antonyms (μ, μa) with μa = μ ◦ αμ to be regular is that μ(z) = 0 for some fixed point z ∈ X of αμ.

Proof. As it is μa(z) = μ(αμ(z)) = μ(z) = 0, it follows z ∈ N(μ, μa).

For example, in X = [1, 10], if μ(x) = (x − 1)/9 and αμ(x) = 11 − x (a symmetry of [1, 10], as αμ²(x) = x, it is non-increasing and verifies αμ(1) = 10), the definition

μa(x) = μ(αμ(x)) = μ(11 − x) = (10 − x)/9

gives the non-strict (relative to Nμ(x) = 1 − x) pair of antonyms (μ, μa). In fact, it holds Nμ(μ(x)) = μa(x).

Let us consider a different kind of example in X = [1, 10]. Taking μ as

μ(x) = 1 if 1 ≤ x ≤ 4; 5 − x if 4 ≤ x ≤ 5; 0 if 5 ≤ x ≤ 10,

it is μ(x) > 0 if and only if x ∈ [1, 5). To have a regular pair of perfect antonyms (μ, μ ◦ αμ), it is sufficient but not necessary that 11 − x < αμ(x), as then 5.5 < z provided that αμ(z) = z.


Remarks
1. If αμ has the fixed point z, from μa(z) = μ(αμ(z)) = μ(z), it is z ∈ E(μ, μa), and μ(z) > 0 implies μ(z) = r (Theorem 4.14). Hence, if μ = μP, μa = μaP, the fixed point z of αμ can be considered to be both P and aP. If this common truth-value is non-zero, the pair of antonyms is not regular.

2. In the former first example in [1, 10] with μ(x) = (x − 1)/9, as μ(x) = 0 only if x = 1, to have a perfect pair of antonyms (μ, μ ◦ αμ) to which Theorem 4.18 can be applied it is necessary that αμ have the fixed point z = 1, and that both A.1 and A.2 are verified by the pair (μ, μ ◦ αμ).

3. When the pair of antonyms is not perfect there is the serious technical difficulty of proving μaa = μ. With perfect antonymy this is obtained automatically, as the hypotheses αμ = αμa and αμ² = id make it immediate.

4. Let us consider P = “less than three” on X = [1, 10]. The extension of P on X is given by the fuzzy (crisp) set

μP(x) = 1 if x ∈ [1, 3); 0 if x ∈ [3, 10].

The negate ¬P of P gives the complement of the set μP−1(1) = [1, 3), that is, μ¬P−1(1) = [3, 10] and μ¬P = 1 − μP, with ¬P ≡ “greater than or equal to three”. Then NP(x) = 1 − x and, for any antonym aP of P, it should be μaP ≤ 1 − μP, and hence μaP−1(1) ⊂ [3, 10].

Consequently, if μaP = μP ◦ αP is a perfect antonym of P, from

μaP(x) = μP(αP(x)) = 1 if αP(x) ∈ [1, 3); 0 if αP(x) ∈ [3, 10],

it follows μaP−1(1) = (αP(3), 10] and it should be (αP(3), 10] ⊂ [3, 10]. Then 3 ≤ αP(3) and, as μP(αP(3)) = 0, all of these perfect antonyms are regular. If 3 = αP(3), it is μaP−1(1) = (3, 10] and aP ≡ “greater than three”. If 3 < αP(3), it is aP ≡ “greater than αP(3)” and if, for example, αP(x) = 1 + 10 − x = 11 − x, it is μaP−1(1) = (8, 10] with aP ≡ “greater than eight”.

Notice that, as μPm(x) = 1 if and only if μaP(x) = μP(x) = 0, it is (μPm)−1(1) = [3, αP(3)], for any t-norm T.
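Remark 4 can be checked directly; the sketch below (ours) uses αP(x) = 11 − x and recovers aP ≡ “greater than eight” on [1, 10].

```python
# Crisp example: P = "less than three" on X = [1, 10],
# perfect antonym via the symmetry alpha_P(x) = 11 - x.

def mu_P(x):
    return 1.0 if 1 <= x < 3 else 0.0

def alpha_P(x):
    return 11.0 - x

def mu_aP(x):
    return mu_P(alpha_P(x))

# mu_aP^{-1}(1) = (8, 10]: aP == "greater than eight"
assert mu_aP(8.0) == 0.0      # alpha_P(8) = 3, not in [1, 3)
assert mu_aP(8.5) == 1.0      # alpha_P(8.5) = 2.5, in [1, 3)
assert mu_aP(10.0) == 1.0
# A.1 holds: mu_aP <= 1 - mu_P everywhere
assert all(mu_aP(x) <= 1.0 - mu_P(x) for x in [1, 2, 2.9, 3, 5, 8, 9, 10])
```

As noted in the text, the crisp P yields a crisp μaP, and the gap [3, 8] is exactly where the medium term is 1.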

4.1 The Case Probable-improbable in Boolean Algebras

After Kolmogorov’s axiomatics, it is generally accepted in mathematics that the predicate P = probable is used, on a Boolean algebra (B, ·, +, ′; ≤; 0, 1), by means of the partial order of B (a ≤ b iff a · b = a) and some probability p : B → [0, 1]. That is, the compatibility functions μP for the different uses of P on B are given by μP(x) = p(x) for all x in B: to fix a concrete use of P on B is


to fix a probability p, and the degree up to which “x is P” is p(x). The ordering ≤P that P induces on B is always taken as the partial ordering ≤, and hence μP is a ≤-measure (vid. [11]) on B, as x ≤ y implies p(x) ≤ p(y), and μP(0) = p(0) = 0, μP(1) = p(1) = 1.

Concerning the use of ¬P = not probable, as (≤P)−1 = ≤−1 (the reverse ordering of ≤: a ≤−1 b iff b ≤ a), and “x is not P” is identified with “not x is P” (“x′ is P”), from p(x′) = 1 − p(x) it follows μ¬P(x) = μP(x′) = 1 − p(x); a function that verifies:

1. μ¬P(1) = 0, with 1 the minimum of (B, ≤−1);
2. μ¬P(0) = 1, with 0 the maximum of (B, ≤−1);
3. If x ≤−1 y, then μ¬P(x) ≤ μ¬P(y),

and consequently it is a ≤−1-measure.

What about the uses of aP = improbable, the antonym of P? We consider only perfect antonyms aP of P, which are given by μaP(x) = μP(α(x)), with some mapping α : B → B such that:

1. α(0) = 1;
2. If x ≤ y, then α(x) ≤−1 α(y);
3. α² = idB, and
4. μaP(x) = μP(α(x)) ≤ μ¬P(x).

Hence, α(1) = α(α(0)) = 0, and p(α(x)) ≤ 1 − p(x).

It should be pointed out that neither 1 − p nor p ◦ α are probabilities, as (1 − p)(0) = 1 and (p ◦ α)(0) = 1; but they are ≤−1-measures, as, if x ≤−1 y, then (1 − p)(x) ≤ (1 − p)(y) and (p ◦ α)(x) ≤ (p ◦ α)(y), and (1 − p)(1) = (p ◦ α)(1) = 0, (1 − p)(0) = (p ◦ α)(0) = 1.

The symmetry α can be a Boolean function or not. If α is Boolean, that is, if α(x) = a · x + b · x′ for some coefficients a, b in B, it is α² = idB if and only if b = a′, and then with α(0) = 1 it results α(x) = x′. Hence, if α is Boolean, μaP(x) = μP(α(x)) = μP(x′) = μ¬P(x), and there is no difference between aP and ¬P. The Booleanity of α corresponds to the non-strict use of aP (a use that appears in some dictionaries of antonyms, which give a(probable) = not-probable), but the reciprocal is not true, as the following example shows:

Example Take B = 2³ with atoms a1, a2, a3, and a probability p. Take α as:

• α(0) = 1, α(1) = 0.
• α(a1′) = a2, α(a2′) = a3, α(a3′) = a1.
• α(a2) = a1′, α(a3) = a2′, α(a1) = a3′.

It is α² = idB, α(0) = 1 and, if x ≤ y, then α(x) ≥ α(y); but α is non-Boolean, as α(x) ≠ x′ (for example, α(a1) = a3′ ≠ a1′). Then:

• μaP(0) = p(1) = 1, μaP(1) = p(0) = 0.
• μaP(a1) = p(a3′) = p(a1) + p(a2); μaP(a2) = p(a1′) = p(a2) + p(a3); μaP(a3) = p(a2′) = p(a1) + p(a3).
• μaP(a1′) = p(a2); μaP(a2′) = p(a3); μaP(a3′) = p(a1).


An easy computation shows that μaP ≤ μ¬P whenever p(a1) = p(a2), with p(a3) = 1 − p(a1) − p(a2) = 1 − 2p(a1). With p(a1) = p(a2) = 1/4, p(a3) = 1/2 it is, for example, μaP(a1) = 1/2 < 3/4 = 1 − p(a1) = μ¬P(a1); hence μaP < μ¬P, and the corresponding use of “improbable” is a strict and regular antonym of “probable”.
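The 2³ example can be explored by enumeration; in the sketch below (our encoding: elements of B as frozensets of atoms), we verify the involution, the non-Booleanity of α, and the single inequality μaP(a1) = 1/2 < 3/4 = μ¬P(a1) computed in the text.

```python
from itertools import chain, combinations

# Boolean algebra B = 2^3 with atoms a1, a2, a3, encoded as frozensets.
atoms = ['a1', 'a2', 'a3']
B = [frozenset(s) for s in chain.from_iterable(
        combinations(atoms, r) for r in range(4))]

prob = {'a1': 0.25, 'a2': 0.25, 'a3': 0.5}
p = lambda x: sum(prob[a] for a in x)
comp = lambda x: frozenset(atoms) - x            # Boolean complement x'

# The non-Boolean symmetry alpha of the example:
alpha = {frozenset(): frozenset(atoms), frozenset(atoms): frozenset()}
alpha[comp({'a1'})] = frozenset({'a2'}); alpha[frozenset({'a2'})] = comp({'a1'})
alpha[comp({'a2'})] = frozenset({'a3'}); alpha[frozenset({'a3'})] = comp({'a2'})
alpha[comp({'a3'})] = frozenset({'a1'}); alpha[frozenset({'a1'})] = comp({'a3'})

# alpha is an involution with alpha(0) = 1, but it is not complementation
assert all(alpha[alpha[x]] == x for x in B)
assert alpha[frozenset({'a1'})] != comp({'a1'})

mu_aP   = lambda x: p(alpha[x])       # improbable
mu_notP = lambda x: 1.0 - p(x)        # not probable

# the strict inequality computed in the text at the atom a1
assert mu_aP(frozenset({'a1'})) == 0.5
assert mu_notP(frozenset({'a1'})) == 0.75
```

This confirms that a non-Boolean symmetry can separate “improbable” from “not probable” at some elements of B.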

4.2 A Complex Example

A special example is the case of the predicate “organized” on the degree of organization of desks and its antonym “disorganized”, because a “non-organized” desk is not necessarily a “disorganized” desk and, likewise, a “non-disorganized” desk is not necessarily an “organized” desk.

Moreover, we assume that while the organization of the desk can decrease, and with it the degree to which it is organized, the desk may not yet be disorganized until a certain weak level of organization is reached. The same happens in the inverse direction: the disorganization of the desk can decrease, and with it the degree to which it is disorganized, while the desk may not yet be organized until a certain weak level of disorganization is reached.

A possible representation of these predicates can be seen in Fig. 8.

Fig. 8 Organized versus Disorganized, Non-Organized and Non-Disorganized (breakpoints x0, x1, . . . , x5, xn; the level 1/2 and the point M are marked)

Notice that α(x) = xn − x for all x ∈ [x0, xn]. Let μaP = μP ◦ α. It follows:

• (μP, μaP) satisfies Definition 4.1 and is a pair of antonyms, since

μaP ≤ N ◦ μP, where N = 1 − id,

∀xi, xj ∈ [x0, xn] : 0 < μP(xi) ≤ μP(xj) ⇒ μaP(xi) ≥ μaP(xj).

• (μP, μaP) satisfies Definition 4.3, hence it is strict:

∀x ∈ (x1, x5) : μaP(x) < N(μP(x)).

• μP satisfies Theorem 4.2:

∀x ∈ [x0, xn] : μP(x) ≤ N(μaP(x)).

• (μP, μaP) satisfies Theorem 4.2:

∀x ∈ [x0, xn] : W(μP(x), μaP(x)) = max(0, μP(x) + μaP(x) − 1) = 0.

• (μP, μaP) satisfies Corollary 4.5:

W* ◦ (μP × μaP) ≠ μX.

It is fairly obvious, since for x ∈ (x1, x2] and x ∈ [x4, x5),

μP(x) + μaP(x) < 1.

• The polygon (x0, x1, M, x5, xn) represents the T-medium fuzzy set, with T = Min (see Definition 4.6). This T-medium (or, more properly, Min-medium) is non-normalized (see Theorem 4.7).

• The neutral zone N(μP, μaP) = ∅, whence this pair of antonyms is non-regular (see Definition 4.10); however, it is perfect (see Definition 4.17).

• The set E(μP, μaP) = {x3} and μP(x3) < 1/2 (see Corollary 4.12, Theorem 4.13 and Corollary 4.16).
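The bullet points above can be checked on a concrete piecewise-linear model; the breakpoints and slopes below are our own illustrative choices on [x0, xn] = [0, 6], not the paper’s figure.

```python
# Our piecewise-linear sketch of the "organized"/"disorganized" pair:
# mu_P is 0 on [0, 1], climbs slowly on [1, 5], steeply on [5, 6];
# mu_aP(x) = mu_P(6 - x) via the symmetry alpha(x) = xn - x.

def mu_P(x):
    if x <= 1:
        return 0.0
    if x <= 5:
        return (x - 1) / 10          # slow ramp, 0.4 at x = 5
    return 0.4 + 0.6 * (x - 5)       # steep ramp to 1 at x = 6

def mu_aP(x):
    return mu_P(6 - x)

xs = [i / 10 for i in range(61)]
# strict pair: the Lukasiewicz W-intersection is empty...
assert all(max(0.0, mu_P(x) + mu_aP(x) - 1) <= 1e-12 for x in xs)
# ...but mu_aP is not the complement 1 - mu_P
assert any(mu_aP(x) < 1 - mu_P(x) for x in xs)
# non-regular: no x with mu_P(x) = mu_aP(x) = 0 ...
assert not any(mu_P(x) == 0.0 and mu_aP(x) == 0.0 for x in xs)
# ... and E contains the midpoint x3 = 3, with mu_P(3) < 1/2
assert mu_P(3) == mu_aP(3) and mu_P(3) < 0.5
```

The flat, low-slope middle section is what keeps μP(x) + μaP(x) strictly below 1 between the extremes, making the pair strict without being complementary.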

5 On Two Controversial Points

Using the model introduced for perfect pairs of antonyms, let us briefly consider two controversial issues concerning antonymy: first, the relation between the predicates ¬(aP) and a(¬P) derived from P; and second, the relation between a(P ∧ Q), a(P ∨ Q), aP and aQ. Notice that Example 2 in Sect. 3.4 furnishes a case in which a(P ∨ Q) can be interpreted as aP ∧ aQ, a duality property that actually does not seem to be general.

Concerning the first issue, if both predicates ¬(aP) and a(¬P) are used on X in such a way that they can be respectively represented by the fuzzy sets μ¬(aP) and μa(¬P), it is:

μ¬(aP) = NaP ◦ μaP = NaP ◦ μP ◦ αP,

and

μa(¬P) = μ¬P ◦ α¬P = NP ◦ μP ◦ α¬P.


Hence, in general it is not true that μ¬(aP) = μa(¬P); that is, the predicates ¬(aP) and a(¬P) are not generally synonyms on a universe X: they cannot be identified. Nevertheless, it is obvious that a sufficient condition for this identification is that NaP = NP and αP = α¬P. These “uniformity” conditions seem far from always attainable, and perhaps can be accepted only in very restricted contexts.

Whenever ¬(aP) ≡ a(¬P), both predicates are synonyms on X, but it is not usual to have a single word designating this predicate. For example, there seem to exist no English words designating either a(not young) or a(not probable).

It should be pointed out that, even if NaP = NP, from μP ◦ αP = μP ◦ α¬P it does not generally follow that αP = α¬P, except, of course, if μP is an injective function.

Concerning the second issue, and provided that μP∧Q is given, in any standard fuzzy set theory, by T ◦ (μP × μQ) for some t-norm T, it is:

μa(P∧Q) = μP∧Q ◦ αP∧Q = T ◦ (μP × μQ) ◦ αP∧Q = T ◦ (μP ◦ αP∧Q × μQ ◦ αP∧Q), and

μaP∧aQ = T ◦ (μaP × μaQ) = T ◦ (μP ◦ αP × μQ ◦ αQ).

Hence, in general it is not the case that μa(P∧Q) = μaP∧aQ. But a sufficient condition for a(P ∧ Q) and aP ∧ aQ to be synonyms is that

αP∧Q = αP = αQ,

a case in which a(P ∧ Q) can be identified with aP ∧ aQ. Under these uniformity conditions, a(P ∨ Q) is again identifiable with aP ∨ aQ, as is easy to prove analogously if, for some t-conorm S, it is μP∨Q = S ◦ (μP × μQ).

It should nevertheless be noticed that the conjunction of the properties a(aP) ≡ P (a general one) and a(P ∧ Q) ≡ aP ∧ aQ (valid only in a restricted situation) results in a contradiction when, with the t-norm of Theorem 4.2.d, μP∧aP = μ∅. In fact, if all that happens, then μ∅ = μP∧aP = μa(aP)∧aP = μa(aP∧P), and the absurdity a(empty) ≡ empty follows. Of course, if P ∧ aP does not represent the empty set (which was proved only for a special t-norm), that contradiction cannot appear in the same way.

The relationship between the involution A and the logical connectives ¬, ∧ and ∨ does not always yield general laws, because aP strictly depends on the language: aP actually exists only if there is a name for it in the corresponding language. For example, in a restricted family domain in which “young and rich” is made synonymous with “interesting”, it could perfectly well be a(interesting) ≡ “old and poor” ≡ a(young) and a(rich).


6 Conclusions

6.1 Antonyms and Symmetries

If with the negate ¬P of a predicate P it is not always the case that the statements “x is ¬¬P” and “x is P” are equivalent, the situation is different with the antonym aP, as the law a(aP) ≡ P seems to be generally accepted for a given antonym aP of P. But the antonym does not seem definable without appealing to the way of negating the predicate: the inequality μaP ≤ μ¬P should be respected, and it shows ¬P as a limit case for any possible antonym of P.

As long as languages are phenomena of great complexity, not fully coincidental with what one finds in dictionaries and grammar textbooks but constantly created by the people using them, a lot of experimentation should be done in order to better know on which (possibly restricted) domains the mathematical models arising from theoretical fuzzy logic can actually be taken as good enough models for language.

Antonymy is in language and, while classical logics cannot possibly cope with it, fuzzy logic not only can but has dealt with it from its very inception. That an antonym is a non-static concept of language, and perhaps not of any kind of syntactical logic, is shown by the fact that “John is old” is used thanks to the previous use, for example, of “Joe is young”: antonymy comes from direct observation of the objects in the universe of discourse. This leads to a certain empirical justification for sometimes making equivalent the statements “x is aP” and “αP(x) is P”, for some adequate symmetry αP : X → X.

Possibly it is the consideration of a concrete ground set X, and of how predicates are used on it, that makes the difference between how an abstract science like Formal Logic approaches language and how an empirical one like Fuzzy Logic does.

6.2 A Final Complex Example

Consider the pair of antonyms (“closed”, “open”) in the following scenario. A door is “closed” if neither light nor a person can get through. The door is “slightly open” if light and sound, but not a person, can get through. Finally, the door is “open” if a person can get through. Let the universe of discourse be [a, b] = [0°, 90°]. A possible representation is shown in Fig. 9.

It becomes apparent that “closed” implies “not open” and “open” implies “not closed”; moreover, the core of “slightly open” represents a neutral zone between “closed” and “open”. That is, “closed” and “open” satisfy the basic requirements to constitute a pair of antonyms. However, no symmetry seems to relate them.

Let ϕ : [a, b] → [a, b] be an (appropriate) order isomorphism. Then, given a predicate P, its antonym aP may be obtained as follows:

• With μP(x), build the transformed function μ̃P(x) = μP(ϕ(x)).
• With μ̃P(x) and the symmetry α(ϕ(x)) = b + a − ϕ(x), build μ̃aP(x).
• Obtain μaP(x) as μ̃aP(ϕ−1(x)).


Fig. 9 Representation of “closed”, “slightly open” and “open”

For the example above, one possible order isomorphism ϕ is shown in Fig. 10. In all former examples, the order isomorphism implicitly used was obviously the identity. The important open question remains how to find an appropriate order isomorphism that allows the use of a simple symmetry to relate antonyms.
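The three-step construction amounts to conjugating the plain reflection r(x) = a + b − x by ϕ: the effective symmetry is αϕ = ϕ ◦ r ◦ ϕ−1, and μaP = μP ◦ αϕ. The sketch below (ours; the quadratic ϕ and the “closed” membership are illustrative assumptions) verifies that αϕ is an involution exchanging the endpoints.

```python
import math

# Conjugate the plain reflection r(x) = a + b - x by an order
# isomorphism phi of [a, b] = [0, 90] (phi is our illustrative choice).

a, b = 0.0, 90.0
phi     = lambda x: b * (x / b) ** 2           # order isomorphism
phi_inv = lambda y: b * math.sqrt(y / b)

def alpha_phi(x):
    """Effective symmetry: alpha_phi = phi o r o phi^{-1}."""
    return phi(a + b - phi_inv(x))

# alpha_phi is an involution exchanging the endpoints
assert abs(alpha_phi(alpha_phi(30.0)) - 30.0) < 1e-9
assert alpha_phi(0.0) == 90.0 and abs(alpha_phi(90.0)) < 1e-9

def mu_closed(x):                 # illustrative "closed" membership
    return max(0.0, 1.0 - x / 30.0)

mu_open = lambda x: mu_closed(alpha_phi(x))    # "open" = closed o alpha_phi
assert mu_open(90.0) == 1.0       # fully open door is not closed at all
```

Because ϕ is non-linear, αϕ is not the plain reflection, which is precisely what lets a simple symmetry relate an asymmetric pair such as “closed”/“open”.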

6.3 Final Remarks

This paper, again a continuation of the former essays [13], [11] and especially [14], is nothing more than a first approach to how fuzzy methodology can help us see more clearly through the fog of language, by attempting to model the concept of antonym, a concept that is far from being sufficiently well understood for operational purposes in Computer Science. It is a concept that plays a relevant role both in concept formation and in meaning representation, as well as in the interpretation of written texts or utterances.

That methodology can also help linguistics as a useful representation for the study of antonymy; for example, to clarify in which cases it is a(P ∧ Q) = aP ∨ aQ, or a(P ∧ Q) = aP ∧ aQ, or something else.

Fig. 10 An order isomorphism ϕ for the example, carrying closed(x), slightly open and open(x) on [a, b] to closed(ϕ(x)) and open(ϕ(x))


Computing with Antonyms 153

References

1. Zadeh, L. (1975) “The concept of a linguistic variable and its application to approximate reasoning” I, II, III. Information Sciences, 8, 199–251; 8, 301–357; 9, 43–80.
2. Trillas, E. (1979) “Sobre funciones de negación en la teoría de conjuntos difusos”. Stochastica, Vol. III, 1, 47–60. Reprinted in English (1998) as “On negation functions in fuzzy set theory”, Advances in Fuzzy Logic, Barro, S., Bugarín, A., Sobrino, A. (eds.), 31–45.
3. Trillas, E. and Riera, T. (1981) “Towards a representation of synonyms and antonyms by fuzzy sets”. Busefal, V, 30–45.
4. Ovchinnikov, S.V. (1981) “Representations of Synonymy and Antonymy by Automorphisms in Fuzzy Set Theory”. Stochastica, 2, 95–107.
5. Warczyk, R. (1981) “Antonymie, Négation ou Opposition?”. La Linguistique, 17/1, 29–48.
6. Lehrer, A. and Lehrer, K. (1982) “Antonymy”. Linguistics and Philosophy, 5, 483–501.
7. Kerbrat-Orecchioni, C. (1984) “Antonymie et Argumentation: La Contradiction”. Pratiques, 43, 46–58.
8. Lehrer, A. (1985) “Markedness and antonymy”. J. Linguistics, 21, 397–429.
9. Trillas, E. (1998) “Apuntes para un ensayo sobre qué ‘objetos’ pueden ser los conjuntos borrosos”. Proceedings of ESTYLF’1998, 31–39.
10. De Soto, A.R. and Trillas, E. (1999) “On Antonym and Negate in Fuzzy Logic”. Int. Jour. of Intelligent Systems, 14, 295–303.
11. Trillas, E. and Alsina, C. (1999) “A Reflection on what is a Membership Function”. Mathware & Soft-Computing, VI, 2–3, 219–234.
12. Trillas, E., Alsina, C. and Jacas, J. (1999) “On contradiction in Fuzzy Logic”. Soft Computing, Vol. 3, 4, 197–199.
13. Trillas, E. and Cubillo, S. (2000) “On a Type of Antonymy in F([a, b])”. Proceedings of IPMU’2000, Vol. III, 1728–1734.
14. Trillas, E., Cubillo, S. and Castiñeira, E. (2000) “On Antonymy from Fuzzy Logic”. Proceedings of ESTYLF’2000, 79–84.
15. Trillas, E., Alsina, C. and Pradera, A. (to appear) “Searching for the roots of Non-Contradiction and Excluded-Middle”. International Journal of General Systems.
16. Guadarrama, S., Trillas, E. and Renedo, E. (to appear) “Non-Contradiction and Excluded-Middle with antonyms”. Proceedings of ESTYLF’2002.


Morphic Computing: Concept and Foundation

Germano Resconi and Masoud Nikravesh

Abstract In this paper, we introduce a new type of computation called “Morphic Computing”. Morphic Computing is based on Field Theory and, more specifically, Morphic Fields. We claim that Morphic Computing is a natural extension of Holographic Computation, Quantum Computation, Soft Computing, and DNA Computing. All natural computations bounded by the Turing Machine can be formalized and extended by our new type of computation model – Morphic Computing. In this paper, we introduce the basis for this new computing paradigm.

Keywords: Morphic computing · Morphogenetic computing · Morphic fields · Morphogenetic fields · Quantum computing · DNA computing · Soft computing · Computing with words · Morphic systems · Morphic neural networks · Morphic system of systems · Optical computation by holograms · Holistic systems

1 Introduction

In this paper, we introduce a new type of computation called “Morphic Computing” [Resconi and Nikravesh, 2006, 2007a, and 2007b]. We assume that computing is not always related to symbolic entities such as numbers, words or other symbols. Fields as entities are more complex than any symbolic representation of knowledge. For example, Morphic Fields include the universal database for both organic (living) and abstract (mental) forms. Morphic Computing can also change or compute non-physical conceptual fields. The basis for Morphic Computing is Field Theory and, more specifically, Morphic Fields. Morphic Fields were first introduced by Rupert Sheldrake [1981] from his hypothesis of formative causation

Germano Resconi
Catholic University, Brescia, Italy
e-mail: [email protected]

Masoud Nikravesh
BISC Program, EECS Department, University of California, Berkeley, CA 94720, USA
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


[Sheldrake, 1981 and 1988], which made use of the older notion of Morphogenetic Fields. Rupert Sheldrake developed his famous theory, Morphic Resonance [Sheldrake 1981 and 1988], on the basis of the work by the French philosopher Henri Bergson [1896, 1911]. Morphic Fields and their subset, Morphogenetic Fields, have been at the center of controversy for many years in mainstream science. The hypothesis is not accepted by some scientists, who consider it a pseudoscience.

“The term [Morphic Fields] is more general in its meaning than Morphogenetic Fields, and includes other kinds of organizing fields in addition to those of morphogenesis; the organizing fields of animal and human behaviour, of social and cultural systems, and of mental activity can all be regarded as Morphic Fields which contain an inherent memory.” – Sheldrake [1988].

We claim that Morphic Fields reshape multidimensional space to generate local contexts. For example, gravitational fields in general relativity are Morphic Fields that reshape space-time to generate the local context where particles move. Morphic Computing combines the ordinary N input basis fields of possible data into one field in the output system. The set of input fields forms an N-dimensional space or context. The N-dimensional space can be obtained by a deformation of a Euclidean space, and we argue that the Morphic Fields are the cause of the deformation. In line with Rupert Sheldrake [1981], our Morphic Fields are the formative causation of the context.

In our approach, the output field of data is the input field X in Morphic Computing. To compute the coherence of X with the context, we project X into the context. Our computation is similar to a quantum measure, where the aim is to find coherence between the behaviour of the particles and the instruments. Therefore, the quantum measure projects the physical phenomena into the instrument as a context. The logic of our projection is the same as quantum logic in the quantum measure. In conclusion, we compute how a context can implement desired results based on Morphic Computing.

Gabor [1972] and Fatmi and Resconi [1988] discovered the possibility of computing images, made by a huge number of points, as output from objects, likewise a set of a huge number of points, as input by reference beams or laser (holography). It is also known that a set of particles can have a huge number of possible states that, in classical physics, are separate from one another: only one state (position and velocity of the particles) would be possible at the same time. In quantum mechanics, by contrast, one can have a superposition of all states, with all states present in the superposition at the same time. It is also very important to note that one cannot separate the states as individual entities, but must consider them as one entity. Because of this very peculiar property of quantum mechanics, one can change all the superposed states at the same time. This type of global computation is the conceptual principle by which we think one can build quantum computers. Similar phenomena can be used to develop DNA computation, where a huge number of DNA elements, as a field, are transformed (replication) at the same time and filtered (selection) to solve non-polynomial problems. In addition, soft computing, or computation by words, extends the classical local definition of the true and false values of a logic predicate to a field of degrees of truth and falsity inside the space of all possible values of the predicates. In this way, the computational power of soft computing is extended, similar to that which one can find in quantum computing, DNA computing, and optical computing by holography. In conclusion, one can expect that all the previous approaches and models of computing are examples of a more general computation model called “Morphic Computing”, where “Morphic” means “form” and is associated with the ideas of holism, geometry, field, superposition, globality and so on.

We claim that Morphic Computing is a natural extension of Optical Computation by holograms [Gabor 1972] and holistic systems [Smuts 1926], in the form of complex Systems of Systems [Kotov 1997], Quantum Computation [Deutsch 1985, Feynman 1982, Omnès 1994, Nielsen and Chuang 2000], Soft Computing [Zadeh 1991], and DNA Computing [Adleman 1994]. All natural computation bounded by the Turing Machine [Turing 1936–7 and 1950], such as classical logic and AI [McCarthy et al. 1955], can be formalized and extended by our new type of computation model – Morphic Computing. In holistic systems (biological, chemical, social, economic, mental, linguistic, etc.), the properties of the system cannot be determined or explained by the sum of its component parts alone: “The whole is more than the sum of its parts” (Aristotle). In this paper, we introduce the basis for our new computing paradigm – Morphic Computing.

2 Morphic Computing and Field Theory: Classical and Modern Approach

2.1 Fields

In this paper, we assume that computing is not always related to symbolic entities such as numbers, words or other symbols. Fields as entities are more complex than any symbolic representation of knowledge. For example, Morphic Fields include the universal database for both organic (living) and abstract (mental) forms. In classical physics, we represent the interaction among particles by local forces that are the cause of the movement of the particles. Also in classical physics, it is more important to know at any moment the individual values of the forces than the structure of the forces. This approach considers the particles to be independent from the other particles under the effect of the external forces. But with further development of the theory of particle physics, researchers discovered that forces are produced by intermediate entities that are not located at one particular point of space but are at every point of a specific space at the same time. These entities are called “Fields”. Based on this new theory, the structure of the fields is more important than the value itself at any point. In this representation of the universe, any particle in any position is under the effect of the fields. Therefore, the fields are used to connect all the particles of the universe into one global entity. However, if any particle is under the effect of the other particles, every local invariant property will disappear, because every system is open and it is not possible to close any local system. To solve this invariance problem, scientists discovered that the local invariant can be conserved with a deformation of the local geometry and the metric of the space. While the form of the invariant does not change for the field, the action is changed. However, these changes are only in reference to how we write the invariance. In conclusion, we can assume that the action of the fields can be substituted with a deformation of the space. When a particle is not under the action of the fields, the invariance of energy, momentum, etc. holds (physical symmetry). However, the references that we have chosen change in space and time in a way that simulates the action of the field. In this case, all the reference space has been changed and the action of the field is only a virtual phenomenon, and we have a different reference space whose geometry in general is non-Euclidean. With quantum phenomena, the problem becomes more complex, because the particles are correlated with one another in a more hidden way, without any physical interaction through fields. This correlation, or entanglement, generates a structure inside the universe for which the probability to detect a particle is a virtual or conceptual field that covers the entire Universe.

2.2 Morphic Computing: Basis for Quantum, DNA, and Soft Computing

Gabor [1972] and H. Fatmi and Resconi [1988] discovered the possibility of computing images, made by a huge number of points, as output from objects, likewise a set of a huge number of points, as input by reference beams or laser (holography). It is also known that a set of particles can have a huge number of possible states that, in classical physics, are separate from one another: only one state (position and velocity of the particles) would be possible at a time. In quantum mechanics, by contrast, one can have a superposition of all states, with all states present in the superposition at the same time. It is also very important to note that one cannot separate the states as individual entities, but must consider them as one entity. Because of this very peculiar property of quantum mechanics, one can change all the superposed states at the same time. This type of global computation is the conceptual principle by which we think one can build quantum computers. Similar phenomena can be used to develop DNA computation, where a huge number of DNA elements, as a field, are transformed (replication) at the same time and filtered (selection) to solve non-polynomial problems. In addition, soft computing, or computation by words, extends the classical local definition of the true and false values of a logic predicate to a field of degrees of truth and falsity inside the space of all possible values of the predicates. In this way, the computational power of soft computing is extended, similar to that which one can find in quantum computing, DNA computing, and Holographic computing. In conclusion, one can expect that all the previous approaches and models of computing are examples of a more general computation model called “Morphic Computing”, where “Morphic” means “form” and is associated with the ideas of holism, geometry, field, superposition, globality and so on.


2.3 Morphic Computing and Conceptual Fields – Non-Physical Fields

Morphic Computing can change or compute non-physical conceptual fields. One example is in representing the semantics of words. In this case, a field is generated by a word or a sentence acting as a source. For example, in a library the reference space would be where the documents are located. For any given word, we define the field as a map from the positions of the documents in the library to the number of occurrences (values) of the word in each document. The word, or source, is located at one point of the reference space (query), but the field (answer) can be located in any part of the reference space.

Complex strings of words (a structured query) generate a complex field, or complex answer, whose structure can be obtained by the superposition of the fields of the words as sources with different intensities. Any field is a vector in the space of the documents. A set of basic fields is a vector space and forms a concept. We break with the traditional idea that a concept is one word in the conceptual map. The internal structure (entanglement) of the concept is the relation of dependence among the basic fields. An ambiguous word is the source (query) of a fuzzy set (field or answer).
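The library picture above can be sketched numerically. This is our own toy illustration, not the authors' system: the occurrence counts and the two words are made up, and each word's field is a vector over document positions that a structured query superposes with different source intensities.

```python
# Each word generates a field of occurrence counts over the document
# positions (the reference space); a structured query is a superposition
# of these word-fields with different source intensities.
field_fuzzy = [3.0, 0.0, 1.0, 0.0]   # hypothetical counts of "fuzzy" per document
field_logic = [1.0, 2.0, 0.0, 0.0]   # hypothetical counts of "logic" per document

intensities = {"fuzzy": 1.0, "logic": 0.5}   # sources of the structured query
answer = [intensities["fuzzy"] * f + intensities["logic"] * g
          for f, g in zip(field_fuzzy, field_logic)]
print(answer)  # [3.5, 1.0, 1.0, 0.0] – the answer field over the documents
```

The answer field lives in the document (object) space; the two word-fields span the vector space that the text calls a concept.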

2.4 Morphic Computing and Natural Languages – Theory of Generalized Constraint

As a particular case, we know that a key assumption in computing with words is that the information conveyed by a proposition expressed in a natural language, or a word, may be represented as a generalized constraint of the form “X isr R”, where X is a constrained variable, R is a constraining relation, and r is an indexing variable whose value defines the way in which R constrains X. Thus, if p is a proposition expressed in a natural language, then “X isr R” represents the meaning of p or, equivalently, the information conveyed by p. Therefore, the generalized constraint model can be represented by field theory in this way: the meaning of any natural-language proposition p is given by the space X of the fields that form a concept in the reference (or objective) space, and by a field R in the same reference space. We note that a concept is not only a word, but a domain or context X where the proposition p, represented by the field R, is located. In this new image the word is not a passive entity but an active one. In fact, the word is the source of the field. We can also use the idea that the word, as an abstract entity, is a query, and the field, as the set of instances of the query, is the answer.
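A generalized constraint with a fuzzy R can be sketched concretely. This is our own toy example, not from the paper: the predicate “young” and its triangular membership function are hypothetical, and the sketch only shows how R assigns each candidate value of X a degree in [0, 1].

```python
# "Age is young": R is a fuzzy set constraining the variable X = Age.
# The (assumed) membership is 1 up to 25, falls linearly, and is 0 from 45 on.
def mu_young(age):
    if age <= 25:
        return 1.0
    if age >= 45:
        return 0.0
    return (45.0 - age) / 20.0

# The constraint maps candidate values of X to degrees of compatibility:
print(mu_young(25))  # 1.0
print(mu_young(35))  # 0.5
print(mu_young(50))  # 0.0
```

In the field-theoretic reading of the text, mu_young is the field R over the reference space of ages, and the word “young” is its source.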

2.5 Morphogenesis and Neural Networks

A neural network is a complex structure that connects simple entities denoted as neurons. The main feature of the neural network is the continuous evolution in complexity of the interconnections. This evolutionary process is called morphogenesis. Besides the evolution in the structure, we also have an evolution in the biochemical network inside the neurons and in the relations among neurons. The biochemical morphogenesis is useful for adapting the neurons to the desired functionality. Finally, in a deeper form, the biochemical and structural morphogenesis is under the control of the gene-network morphogenesis. In fact, any gene will control the activity of the other genes through a continuously adaptive network. Morphogenesis is essential to brain plasticity, so as to obtain homeostasis, or the invariance of the fundamental and vital functionality of the brain. The predefined vital functions interact with the neural network in the brain, whose activity is oriented to the implementation, or projection, of the vital functions into the neural activities. In conclusion, morphogenetic activity is oriented to compensate for the difference between the vital functions and the neural reply. The field nature of the designed functionality as input, and of the implemented functionality in the neural network as output, suggests the holistic nature of neural network activity. The neural network, with its morphogenesis, can be considered a prototype of Morphic Computing.

2.6 Morphic Systems and Morphic System of Systems (M-SOS)

We know that the System of Systems (SoS) movement studies large-scale systems integration. Traditional systems engineering is based on the assumption that, given the requirements, the engineer will give you the system. The emerging system-of-systems context arises when a need or set of needs is met with a mix of multiple systems, each capable of independent operation, in order to fulfil the global mission or missions.

For example, design optimisation strategies focus on minimizing or maximizing an objective while meeting several constraints. These objectives and constraints typically characterize the performance of the individual system for a typical design mission or missions. However, these design strategies rarely address the impact on the performance of a larger system of systems, nor do they usually address the dynamic, evolving environment in which the system of systems must act. A great body of work exists that can address “organizing” a system of systems from existing single systems: resource allocation is currently used in any number of fields of engineering and business to improve profit and other systems optimisations. One reason for this new emphasis on large-scale systems is that customers want solutions that provide a set of capabilities, not a single specific vehicle or system meeting an exact set of specifications: “There is growing recognition that one does not necessarily attack design problems solely at the component level.” The cardinal point for SoS studies is that the solutions are unlikely to be found in any one field. There is an additional consideration: a fundamental characteristic of the problem areas where we detect SoS phenomena is that they are open systems, without fixed and stable boundaries, that adapt to changing circumstances. In SoS there are a common language and shared goals; communication helps form communities around a confluence of issues, goals, and actions. Therefore, if we wish to increase the dimensionality, we need tools not to replace human reasoning but to assist and support this type of associative reasoning. Important effects occur at multiple scales, involving not only multiple phenomena but phenomena of very different types. These, in turn, are intimately bound to diverse communities of affected individuals. Human activity now yields effects on a global scale, yet we lack the tools to understand the implications of our choices. Given a set of prototype systems, integrations (superpositions) of these systems generate a family of meta-systems. Given the requirement X, the MS (Morphic System) reshapes X in a minimal way, or not at all. The reshaped requirements belong to the family of meta-systems and can be obtained by an integration of the prototype systems. In conclusion, the MS is the instrument to implement the SoS. To use the MS, we represent any individual system as a field; the superposition with the fields of the other systems is the integration process. This is similar to the quantum computer, where any particle (system) in quantum physics is a field of uncertainty (density of probability). The integration among particles is obtained by a superposition of the different uncertainties. The meta-system, or SoS, is the result of the integration. Any instrument in quantum measure has only a few possible numbers of integration, so the quantum measure reshapes the given requirement, or physical integration, to be one of the possible integrations in the instrument. Thus the quantum instrument is a particular case of the SoS and of the MS. In the MS we also give a suitable geometric structure (the space of the fields of the prototype systems) to represent integration in the SoS and correlation among the prototype systems. With this geometry we give an invariance property for transformations of the prototype systems and of the requirement X. So we adjoin to the SoS integration a dynamical process whose dynamical law is the invariant. Since the geometry is in general non-Euclidean, we can define for the SoS a morphic field (MF) whose action is to give the deformation of the geometry from Euclidean (independent prototype systems) to non-Euclidean (dependent prototype systems). The Morphic Field is comparable to the gravity field in general relativity. So the models of quantum mechanics and general relativity are a special case of the MS, and are also the physical model for the SoS.

2.7 Morphic Computing and Agents – Non-Classical Logic

In the agent image, where only one word (query) as a source is used for any agent, the field generated by the word (answer) is a Boolean field (the values at any point are true or false). Therefore, we can compose the words by logic operations to create a complex Boolean expression, or complex Boolean query. This query generates a Boolean field for any agent. The set of agents creates a set of elementary Boolean fields whose superposition is a fuzzy set, represented by a field with fuzzy values. The field is the answer to the ambiguous structured query whose source is the complex expression p. The fields with fuzzy values for complex logic expressions are coherent with traditional fuzzy logic, with greater conceptual transparency, because they are founded on agents and Boolean logic structure. As [Nikravesh, 2006] points out, the Web is a large, unstructured and in many cases conflicting set of data. So in the World Wide Web, fuzzy logic and fuzzy sets are essential parts of a query, and also of finding appropriate searches to obtain the relevant answer. In the agent interpretation of the fuzzy set, the net of the Web is structured as a set of conflicting and in many cases irrational agents whose task is to create any concept. Agents produce actions to create answers for ambiguous words in the Web. A structured query in RDF can be represented as a graph of three elementary concepts, as subject, predicate and complement, in a conceptual map. Every word and relationship in the conceptual map is a variable whose values are fields, and their superposition gives the answer to the query. Because we are more interested in the meaning of the query than in how we write the query itself, we are more interested in the field than in how we produce the field by the query. In fact, different linguistic representations of the query can give the same field or answer.

In the construction of the query, we use words as sources of fields with different intensities. With superposition, we obtain the answer to our structured query. We structure the text or query to build the described field or meaning. It is also possible to use the answer, as a field, to generate the intensities of the words as sources inside a structured query. The first process is denoted the READ process, by which we can read the answer (meaning) of the structured query. The second is the WRITE process, by which we give the intensity or rank of the words in a query when we know the answer. As an analogy to holography, the WRITE process is the construction of the hologram when we know the light field of the object, and the READ process is the construction of the light-field image from the hologram. In holography, the READ process uses a beam of coherent light, such as a laser, to obtain the image. Now, in our structured query, the words inside the text are activated at the same time. The words as sources are coherent in the construction, by superposition, of the desired answer or field. The field image of the computation by words, in a crisp and fuzzy interpretation, prepares the implementation of the Morphic Computing approach to computation by words. In this way, we have presented an example of the meaning of the new type of computation, “Morphic Computing”.
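The READ/WRITE pair can be sketched numerically. This is our illustration, not the authors' implementation: the basis word-fields are columns of a matrix H with made-up occurrence counts, READ superposes the sources into an answer field, and WRITE recovers the source intensities from a known answer field, here by least squares.

```python
import numpy as np

# Two basis word-fields over 4 document positions (hypothetical counts).
H = np.array([[3.0, 1.0],
              [0.0, 2.0],
              [1.0, 0.0],
              [0.0, 0.0]])

S = np.array([1.0, 0.5])        # source intensities of the two words
Y = H @ S                        # READ: the answer field of the query

# WRITE: given the answer field Y, recover the word intensities.
S_rec, *_ = np.linalg.lstsq(H, Y, rcond=None)
print(np.allclose(S_rec, S))     # True
```

Because the two word-fields are linearly independent here, WRITE recovers the intensities exactly; with dependent or noisy fields it would return the best superposition in the least-squares sense.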

2.8 Morphic Computing: Basic Concepts

Morphic Computing is based on the following concepts:

1) The concept of field in the reference space.
2) The fields as points or vectors in the N-dimensional Euclidean space of the objects (points).
3) A set of M ≤ N basis fields in the N-dimensional space. The M fields are vectors in the N-dimensional space and form a non-Euclidean subspace H (context) of the space N. The coordinates Sα in M of the field X are the contravariant components of the field X. The components of X in M are also the intensities of the sources of the basis fields. The superposition of the basis fields with different intensities gives us the projection Q of X, or Y = QX, into the space H. When M < N, the projection operator of X into H defines a constraint or relation among the components of Y.
4) With tensor calculus on the components Sα of the vector X, or on the components of more complex entities such as tensors, we can generate invariants for any unitary transformation of the object space or change of the basis fields.
5) Given two projection operators Q1, Q2 on two spaces H1, H2 with dimensions M1 and M2, we can generate the space M = M1 M2 with the product of Y1 and Y2, or Y = Y1 Y2. Any projection Q into the space H, or Y = QX, of the product of the basis fields generates Y. When Y ≠ Y1 Y2, the output Y is in an entangled state and cannot be separated into the two projections Q1 and Q2.
6) The logic of the Morphic Computing entity is the logic of the projection operators, which is isomorphic to quantum logic.
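Concept 3) can be sketched with a small numerical example. This is our own illustration with made-up numbers: M = 2 basis fields are the columns of H in an N = 4 dimensional object space, Q is the standard orthogonal projection onto span(H), and the sources S are the coordinates of the projection Y = QX.

```python
import numpy as np

# M = 2 basis fields as columns in the N = 4 dimensional object space.
H = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])

Q = H @ np.linalg.inv(H.T @ H) @ H.T      # projection operator onto the context H
X = np.array([2.0, 1.0, 0.0, 3.0])        # an input field
Y = Q @ X                                  # projection of X into the context
S = np.linalg.inv(H.T @ H) @ H.T @ X      # source intensities of X in H

print(np.allclose(H @ S, Y))              # True: Y is the superposition of the basis fields
print(np.allclose(Q @ Q, Q))              # True: Q is idempotent, as a projection must be
```

Since M < N here, the component of X outside span(H) (e.g. the fourth coordinate) is lost in Y, which is the constraint among the components of Y that item 3) describes.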

The information can be coded inside the basis fields by the relations among the basis fields. In Morphic Computing, the relation is represented by a non-Euclidean geometry whose metric, or expression of the distance between two points, shows this relation.

The projection operator is similar to the measure in quantum mechanics. The projection operator can introduce constraints among the components of Y.

The sources are the instrument to control the image Y in Morphic Computing. There is a strong analogy between Morphic Computing and computation by holography, and computation by secondary sources (Jessel 1982) in the physical field.

The computation of Y from X by the projection operator Q, which projects X into the space H, gives its result when Y is similar to X. In this case, the sources S are the solution of the computation. We see the analogy with the neural network, where the solution is to find the weights wk at the synapses. In this paper, we show that the weights are sources in Morphic Computing.

Now, it is possible to compose different projection operators in a network of Morphic Systems, and such a system is naturally considered a System of Systems.

Any Morphic Computation is always context dependent, where the context is H. The context H, via the operator Q, defines a set of rules comparable to the rules implemented in a digital computer. So when we change the context while keeping the same operations, we obtain different results, and we can control the context so as to obtain wanted results. When any projection operator of X, or QX, is regarded as a measure, in analogy with quantum mechanics, any projection operator depends on the previous projection operator. In the measure analogy, any measure depends on the previous measures; that is, on the path of measures or projection operators realised beforehand, through the history. So we can say that different projection operators form a story (see Roland Omnès [1994] on stories in quantum mechanics).

The analogy of the measure also gives us another intuitive idea of Morphic Computing. A measure becomes a good measure when it gives us an image Y of the real phenomenon X that is similar to it, that is, when the internal rules of X are not destroyed in the measure process. The same holds for Morphic Computing: the computation is a good computation when the projection operator does not destroy the internal relations of the input field X.


The analogy with the measure in quantum mechanics is also useful to explain the concept of Morphic Computing, because the instrument in the quantum measure is the fundamental context that interferes with the physical phenomena, just as H interferes with the input field X.

A deeper connection exists between the projection-operator lattice that represents quantum logic and Morphic Computing processes (see Eddie Oshins 1992).

Any fuzzy set is a scalar field of membership values on the factors (the reference space) (Wang and Sugeno, 1982), and we recall that any concept can be viewed as a fuzzy set in the factor space. So to fuzzy sets we can apply all the processes and concepts that we utilise in Morphic Computing.

For the relation between concept and field, we introduce into field theory an intrinsic fuzzy logic. So in Morphic Computing, we have an external logic of the projection, or measure (quantum logic), and a possible internal fuzzy logic from the fuzzy interpretation of the fields.

In the end, because we also use agents' superposition to define fuzzy sets and fuzzy rules, we can again use Morphic Computing to compute the agents' inconsistency and irrationality. Thus, fuzzy sets and fuzzy logic are part of the more general computation denoted as Morphic Computing.

3 Reference Space, Space of the Objects, Space of the Fields in Morphic Computing

Given the n-dimensional reference space (R1, R2, . . . , Rn), any point P = (R1, R2, . . . , Rn) is an object. Now we create the space of the objects, whose dimension is equal to the number of the points, and where the value of a coordinate in this space is equal to the value of the field at the corresponding point. We call this space of the points the "space of the objects".

Any field connects all the points in the reference space and is represented as a point in the object space. The components of this vector are the values of the field at the different points. We know that each of two points connected by a link assumes the value of one of the connections; all the other points assume the value zero. Now any value of the field at a point can be considered as a degree of connection of this point with all the others. Therefore, at a point where the field is zero, we can consider the point as non-connected to the others: because the field at this point is zero, the other points cannot be connected by the field to the given point. In conclusion, we consider the field as a global connector of the objects in the reference space. Inside the space of the objects, we can locate any type of field as vectors or points. In field theory, we assume that any complex field can be considered as a superposition of prototype fields whose model is well known.

The prototype fields are vectors in the space of the objects that form a new reference or field space. In general, the field space is a non-Euclidean space. In conclusion, any complex field Y can be written in this way:


Morphic Computing: Concept and Foundation 165

Y = S1 H1(R1, . . . , Rn) + S2 H2(R1, . . . , Rn) + . . . + Sn Hn(R1, . . . , Rn) = H(R) S   (1)

In (1), H1, H2, . . . , Hn are the basic fields or prototype fields and S1, S2, . . . , Sn are the weights or source values of the basic fields. We assume that any basic field is generated by a source. The intensity of a prototype field is proportional to the intensity of the source that generates the field itself.

3.1 Example of the Basic Field and Sources

In Fig. 1, we show an example of two different basic fields in a two-dimensional reference space (x, y). The general equation of the fields is

F(x, y) = S exp(−h((x − x0)^2 + (y − y0)^2))   (2)

The parameters of the field F1 are S = 1, h = 2, x0 = −0.5, and y0 = −0.5; the parameters of the field F2 are S = 1, h = 2, x0 = 0.5, and y0 = 0.5.

For the sources S1 = 1 and S2 = 1, the superposition field F shown in Fig. 2 is F = F1 + F2. For the sources S1 = 1 and S2 = 2, the superposition field F, shown again in Fig. 2, is F = F1 + 2 F2.
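As a small numerical sketch of Eq. (2) and of the two superpositions F = F1 + F2 and F = F1 + 2 F2 shown in Fig. 2 (assuming NumPy; the sampling grid is an illustrative choice, not specified in the text):

```python
import numpy as np

def basic_field(x, y, S=1.0, h=2.0, x0=0.0, y0=0.0):
    """Prototype field of Eq. (2): F(x, y) = S * exp(-h((x - x0)^2 + (y - y0)^2))."""
    return S * np.exp(-h * ((x - x0) ** 2 + (y - y0) ** 2))

# Sample the two-dimensional reference space (x, y) on a grid.
x, y = np.meshgrid(np.linspace(-1, 1, 50), np.linspace(-1, 1, 50))

# The two basic fields of Fig. 1, with the parameters given in the text.
F1 = basic_field(x, y, S=1.0, h=2.0, x0=-0.5, y0=-0.5)
F2 = basic_field(x, y, S=1.0, h=2.0, x0=0.5, y0=0.5)

# The two superpositions of Fig. 2: sources (S1, S2) = (1, 1) and (1, 2).
F_a = F1 + F2
F_b = F1 + 2 * F2
```

Each basic field peaks at its centre (x0, y0) with the value of its source S, so changing the sources rescales the contribution of each prototype without changing its shape.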

3.2 Computation of the Sources

To compute the sources Sk, we represent the prototype fields Hk and the input field X in Table 1, where the objects are the points and the attributes are the fields.

The values in Table 1 are represented by the following matrices.

Fig. 1 Two different basic fields F1 and F2 in the two-dimensional reference space (x, y)


Fig. 2 Example of superposition of the elementary fields F1, F2 (shown for F = F1 + F2 and F = F1 + 2 F2)

Table 1 Field values for M points in the reference space

        H1      H2      . . .   HN      Input Field X
P1      H1,1    H1,2    . . .   H1,N    X1
P2      H2,1    H2,2    . . .   H2,N    X2
. . .   . . .   . . .   . . .   . . .   . . .
PM      HM,1    HM,2    . . .   HM,N    XM

H = [ H1,1   H1,2   . . .   H1,N
      H2,1   H2,2   . . .   H2,N
      . . .  . . .  . . .   . . .
      HM,1   HM,2   . . .   HM,N ] ,      X = [ X1  X2  . . .  XM ]^T

The matrix H is the relation among the prototype fields Hk and the points Ph. At this point, we are interested in the computation of the sources S that give the best linear model of X in terms of the elementary field values. Therefore, we have the superposition expression

Y = S1 [ H1,1  H2,1  . . .  HM,1 ]^T + S2 [ H1,2  H2,2  . . .  HM,2 ]^T + . . . + Sn [ H1,n  H2,n  . . .  HM,n ]^T = H S   (3)

Then, we compute the best sources S such that the difference |Y − X| is the minimum distance over any possible choice of the set of sources. It is easy to show that the best sources are obtained by the expression

S = (H^T H)^−1 H^T X   (4)
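Equation (4) is the ordinary least-squares solution of Eq. (3), so it can be sketched directly (assuming NumPy; the matrix sizes and the synthetic input field are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

M, N = 100, 3                    # M points, N prototype fields
H = rng.normal(size=(M, N))      # column k holds the prototype field H_k sampled at the M points
S_true = np.array([1.0, 2.0, -0.5])

# Synthetic input field: a superposition of the prototypes plus a small residual.
X = H @ S_true + 0.01 * rng.normal(size=M)

# Eq. (4): S = (H^T H)^{-1} H^T X, computed stably via least squares.
S = np.linalg.lstsq(H, X, rcond=None)[0]

Y = H @ S   # best reconstruction of X in the space spanned by the prototype fields
```

`lstsq` solves the same normal equations as Eq. (4) but avoids forming the explicit inverse, which is numerically preferable when H^T H is ill-conditioned.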

Given the previous discussion and field presentation, the elementary Morphic Computing element is given by the input-output system shown in Fig. 3.

Fig. 3 Elementary Morphic Computing: the input field X and the prototype fields H(R) determine the sources S = (H^T H)^−1 H^T X and the output field Y = H S = Q X

Figure 4 shows a network of elementary Morphic Computing units with three sets of prototype fields and three types of sources, with one general field X in input, one general field Y in output, and intermediary fields between X and Y.

When H is a square matrix, we have Y = X and

S = H−1 X and Y = X = H S (5)

Now, for any elementary computation in Morphic Computing, we have the following three fundamental spaces.

1) The reference space
2) The space of the objects (points)
3) The space of the prototype fields

Figure 5 shows a very simple geometric example where the number of the objects is three (P1, P2, P3) and the number of the prototype fields is two (H1, H2). The space whose coordinates are the two fields is the space of the fields.

Please note that the output Y = H S is the projection of X into the space H:

Y = H (H^T H)^−1 H^T X = Q X

with the property Q^2 X = Q X. Therefore, the input X can be separated into two parts,

X = Q X + F

where the vector F is perpendicular to the space H, as we can see in the simple example given in Fig. 6.
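The decomposition X = Q X + F, together with the idempotence Q^2 = Q and the perpendicularity of F to the space H, can be checked numerically (a sketch assuming NumPy; H and X are random illustrative data):

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.normal(size=(5, 2))            # two prototype fields over five points
X = rng.normal(size=5)                 # an arbitrary input field

Q = H @ np.linalg.inv(H.T @ H) @ H.T   # projection onto the space spanned by H
Y = Q @ X                              # output of the elementary Morphic Computing unit
F = X - Y                              # residual, so that X = QX + F

assert np.allclose(Q @ Q, Q)           # Q is idempotent: Q^2 = Q
assert np.allclose(H.T @ F, 0)         # F is perpendicular to the space H
```

The two assertions are exactly the geometric facts used in the text: projecting twice changes nothing, and the part of X that the prototype fields cannot represent is orthogonal to all of them.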

Fig. 4 Network of Morphic Computing (prototype fields H1, H2, H3; sources S1, S2, S3; input field X, output field Y)


Fig. 5 The fields H1 and H2 form the space of the fields. The coordinates of the vectors H1 and H2 are the values of the fields at the three points P1, P2, P3

Now, we try to extend the expression of the sources in the following way. Given G(Λ) = Λ^T Λ and G(H) = H^T H, let

S* = [G(H) + G(Λ)]^−1 H^T X,   so that   [G(H) + G(Λ)] S* = H^T X

Writing S* = (H^T H)^−1 H^T X + Δ = S + Δ, we have

[G(H) + G(Λ)] (S + Δ) = H^T X
(H^T H)(H^T H)^−1 H^T X + G(Λ) S + [G(H) + G(Λ)] Δ = H^T X
G(Λ) S + [G(H) + G(Λ)] Δ = 0

and

S* = S + Δ = (H^T H + Λ^T Λ)^−1 H^T X

where Δ is a function of S through the equation G(Λ) S + [G(H) + G(Λ)] Δ = 0.

For a non-square and/or singular matrix, we can use the generalized model given by Nikravesh [11] as follows:

S* = (H^T Λ^T Λ H)^−1 H^T Λ^T Λ X = ((ΛH)^T (ΛH))^−1 (ΛH)^T Λ X

where we transform by Λ both the input X and the references H. The value of the variable D (the metric of the space of the fields) is computed by expression (6):

D^2 = (H S)^T (H S) = S^T H^T H S = S^T G S = (Q X)^T Q X   (6)
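The Λ-weighted source expression and the metric D of Eq. (6) can be sketched as follows (assuming NumPy; the diagonal weighting matrix Λ and the data sizes are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 6, 2
H = rng.normal(size=(M, N))                   # prototype fields over M points
X = rng.normal(size=M)                        # input field
Lam = np.diag(rng.uniform(0.5, 2.0, size=M))  # illustrative weighting transform Λ

# Generalized sources: S* = ((ΛH)^T ΛH)^{-1} (ΛH)^T ΛX
LH = Lam @ H
S_star = np.linalg.solve(LH.T @ LH, LH.T @ (Lam @ X))

# Metric of Eq. (6): D^2 = S^T G S with G = H^T H (unweighted sources S)
S = np.linalg.solve(H.T @ H, H.T @ X)
D2 = S @ (H.T @ H) @ S
```

With Λ = I the generalized expression collapses back to Eq. (4); a non-trivial Λ re-weights how strongly each point of the reference space constrains the sources.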

For a unitary transformation U, for which U^T U = I and H' = U H, the prototype fields change in the following way:

Fig. 6 Projection operator Q and output of the elementary Morphic Computing. We see that X = Q X + F, where the sum is the vector sum (points P1, P2, P3; Y = Q X)


H' = U H

G' = (U H)^T (U H) = H^T U^T U H = H^T H

and

S' = [(U H)^T (U H)]^−1 (U H)^T Z = G^−1 H^T U^T Z = G^−1 H^T (U^−1 Z)

For Z = U X we have S' = S, and the variable D is invariant under the unitary transformation U.

We remark that G = H^T H is a square matrix that gives the metric tensor of the space of the fields. When G is a diagonal matrix, the elementary fields are all independent of one another. But when G has non-diagonal elements, the elementary fields are dependent on one another: among the elementary fields there is a correlation or relationship, and the geometry of the space of the fields is a non-Euclidean geometry.
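The invariance of the sources S and of the metric D under a unitary transformation U can be verified numerically (a sketch assuming NumPy; U is generated by a QR decomposition of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
M, N = 5, 2
H = rng.normal(size=(M, N))
X = rng.normal(size=M)

# Random unitary (orthogonal) U via QR decomposition, so that U^T U = I.
U, _ = np.linalg.qr(rng.normal(size=(M, M)))

S = np.linalg.solve(H.T @ H, H.T @ X)
D2 = S @ (H.T @ H) @ S

Hp = U @ H                      # H' = U H
Z = U @ X                       # Z = U X
Sp = np.linalg.solve(Hp.T @ Hp, Hp.T @ Z)
D2p = Sp @ (Hp.T @ Hp) @ Sp

assert np.allclose(Sp, S)       # S' = S
assert np.isclose(D2p, D2)      # D is invariant under the unitary transformation
```

This matches the derivation in the text: the metric tensor G' = (UH)^T (UH) = H^T H is unchanged, so the sources and the metric D do not depend on the unitary frame.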

References

1. L.M. Adleman (1994), "Molecular Computation of Solutions to Combinatorial Problems", Science 266 (11): 1021–1024.
2. H. Bergson (1896), Matter and Memory (Matière et mémoire), Zone Books 1990: ISBN 0-942299-05-1 (published in English in 1911).
3. D. Deutsch (1985), "Quantum Theory, the Church-Turing Principle, and the Universal Quantum Computer", Proc. Roy. Soc. Lond. A400, 97–117.
4. R.P. Feynman (1982), "Simulating Physics with Computers", International Journal of Theoretical Physics 21: 467–488.
5. H.A. Fatmi and G. Resconi (1988), A New Computing Principle, Il Nuovo Cimento Vol. 101B, N.2, Febbraio 1988, pp. 239–242.
6. D. Gabor (1972), Holography 1948–1971, Proc. IEEE Vol. 60, pp. 655–668, June 1972.
7. M. Jessel (1973), Acoustique Théorique, Masson et Cie Editeurs.
8. V. Kotov (1997), "Systems-of-Systems as Communicating Structures," Hewlett Packard Computer Systems Laboratory Paper HPL-97-124, pp. 1–15.
9. J. McCarthy, M.L. Minsky, N. Rochester, and C.E. Shannon (1955), Proposal for Dartmouth Summer Research Project on Artificial Intelligence, August 31, 1955.
10. M. Nielsen and I. Chuang (2000), Quantum Computation and Quantum Information, Cambridge: Cambridge University Press. ISBN 0-521-63503-9.
11. M. Nikravesh (2003), Intelligent Computing Techniques for Complex Systems, in Soft Computing and Intelligent Data Analysis in Oil Exploration, pp. 651–672, Elsevier.
12. R. Omnès (1994), The Interpretation of Quantum Mechanics, Princeton Series in Physics.
13. E. Oshins, K.M. Ford, R.V. Rodriguez and F.D. Anger (1992), A comparative analysis: classical, fuzzy, and quantum logic. Presented at 2nd Florida Artificial Intelligence Research Symposium, St. Petersburg, Florida, April 5, 1989. In Fishman, M.B. (Ed.), Advances in Artificial Intelligence Research, Vol. II, Greenwich, CT: JAI Press. Most Innovative Paper Award, 1989, FLAIRS-89, Florida AI Research Symposium.
14. G. Resconi and M. Nikravesh (2007), Morphic Computing, to be published in Forging New Frontiers.
15. G. Resconi and M. Nikravesh (2007), Morphic Computing, book in preparation.
16. G. Resconi and M. Nikravesh (2006), Field Theory and Computing with Words, FLINS 2006, 7th International FLINS Conference on Applied Artificial Intelligence, August 29–31, 2006, Italy.
17. G. Resconi and L.C. Jain (2004), Intelligent Agents, Springer.
18. R. Sheldrake (1981), A New Science of Life: The Hypothesis of Morphic Resonance (1981, second edition 1985), Park Street Press; Reprint edition (March 1, 1995).
19. R. Sheldrake (1988), The Presence of the Past: Morphic Resonance and the Habits of Nature, Park Street Press; Reprint edition (March 1, 1995).
20. J.C. Smuts (1926), Holism and Evolution, MacMillan; Compass/Viking Press 1961 reprint: ISBN 0-598-63750-8; Greenwood Press 1973 reprint: ISBN 0-8371-6556-3; Sierra Sunrise 1999 (mildly edited): ISBN 1-887263-14-4.
21. A.M. Turing (1936–7), "On computable numbers, with an application to the Entscheidungsproblem", Proc. London Math. Soc., ser. 2, 42: 230–265; also in (Davis 1965) and (Gandy and Yates 2001).
22. A.M. Turing (1950), "Computing machinery and intelligence", Mind 50: 433–460; also in (Boden 1990), (Ince 1992).
23. P.Z. Wang and M. Sugeno (1982), The factor fields and background structure for fuzzy subsets, Fuzzy Mathematics, Vol. 2, pp. 45–54.
24. L.A. Zadeh (1994), Fuzzy Logic, Neural Networks and Soft Computing, Communications of the ACM, 37(3): 77–84.
25. L.A. Zadeh and M. Nikravesh (2002), Perception-Based Intelligent Decision Systems; Office of Naval Research, Summer 2002 Program Review, Covel Commons, University of California, Los Angeles, July 30–August 1, 2002.
26. L.A. Zadeh and J. Kacprzyk (eds.) (1999a), Computing With Words in Information/Intelligent Systems 1: Foundations, Physica-Verlag, Germany.
27. L.A. Zadeh and J. Kacprzyk (eds.) (1999b), Computing With Words in Information/Intelligent Systems 2: Applications, Physica-Verlag, Germany.
28. L.A. Zadeh (1965), Fuzzy sets, Inf. Control 8, 338–353.


A Nonlinear Functional Analytic Framework for Modeling and Processing Fuzzy Sets

Rui J. P. de Figueiredo

Abstract We present a nonlinear functional analytic framework for modeling and processing fuzzy sets in terms of their membership functions. Let X = {x} denote a universe of discourse, and A, a fuzzy set of elements x in X with membership function μA. First, we formally introduce a class C = {A} of attributes A, and a judgment criterion J in the definition of μA, and explain the role of such an attribute- and judgment-based membership function μA(A, J, x) in the interpretation of A as a value-based or uncertainty-based fuzzy set. Second, for uncertainty-based fuzzy sets, we associate with each attribute A (e.g., old) a corresponding event, also denoted by A (e.g., the event of being old), the set C of all such events constituting a completely additive class in an appropriate probability space, x being a random variable, vector, or object induced by this space. On such a basis, we present and discuss the role of the membership function μA as a generalization of the concept of posterior probability P(A/x). This allows us to introduce rigorously, on a single platform, both human and machine judgment J in assigning objects to fuzzy sets by minimizing conditional risk. Third, we assume that X is a vector space endowed with a scalar product, such as a Euclidian Space or a separable Hilbert Space. Typically, X = {x} would be a feature space, its elements x = x(p) being feature vectors associated with objects p of interest in the discourse. Then, membership functions become membership "functionals", i.e., mappings from a vector space to the real line. Fourth, with this motivation, we focus attention on the class Φ of fuzzy sets A whose membership functions μA are analytic (nonlinear) functionals on X, which can therefore be represented by abstract power series in the elements x of X. Specifically, μA in Φ are assumed to lie in a positive cone Λ of vectors in a Generalized Fock Space F(X). This F(X) is a Reproducing Kernel Hilbert Space (RKHS) of analytic functionals on X introduced by the author and T. A. W. Dwyer in 1980 for nonlinear signals and systems analysis. Fifth, in such a setting, we view the μA as vectors in F(X) acting as "representers" of their respective fuzzy sets A. Thus, because of the one-to-one relationship between the fuzzy sets A and

Rui J. P. de Figueiredo
Laboratory for Intelligent Signal Processing and Communications, California Institute for Telecommunications and Information Technology, University of California Irvine, Irvine, CA 92697-2800

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


their respective μA, the fuzzy sets A can benefit from all the results of analytical processing of their μA as vectors. Finally, we derive a "best model" Â for a fuzzy set A based on a best approximation μ̂A of its membership functional μA in the space F(X), subject to appropriate interpolating or smoothing training data constraints and a positivity constraint. The closed-form solution μ̂A thus obtained appears in the form of an artificial network, proposed by the author in another theoretical context in 1990. For the purpose of illustration of the underlying technology, an application to the diagnosis of Alzheimer's disease is briefly discussed.

1 Introduction

In 1965, L.A. Zadeh generalized the concept of an ordinary set to that of a fuzzy set [1]. In the four decades that have elapsed since then, this conceptual generalization has enabled the development of simple and yet powerful rules and methodologies for processing data sets taking into account the uncertainty present in them (see, e.g., [2, 3, 4, 5]).

To further extend the capabilities of the above generalization, we present, in this chapter, a framework for modeling and processing fuzzy sets in terms of their membership functions.

For this purpose, we formally introduce a class C = {A} of attributes A, and a judgment criterion J in the definition of μA, and explain the role of such a μA(A, J, x) in the interpretation of a set A consisting of elements x of a universal set X as a value-based or uncertainty-based fuzzy set.

For uncertainty-based fuzzy sets we assume that there is a one-to-one relationship between an attribute A and a corresponding event, denoted by the same symbol, in an appropriate probability space. In such a setting, we present and discuss the role of the membership function μA as a generalization of the concept of posterior probability P(A/x). This allows us to introduce rigorously, on a single platform, both human and machine judgment J in minimizing the conditional risk for the purpose of linking x with the attribute A characterizing the fuzzy set A.

Then, under the conditions to be stated in a moment, we allow the membership functionals μA to be analytic (nonlinear) functionals, from X to the real line, belonging to a Reproducing Kernel Hilbert Space F(X). We may think of them as "representers" of their respective fuzzy sets A. In such a setting, we pose the problem of finding the "best model" for A as a constrained optimization problem of finding the best approximation μ̂A for the corresponding μA in F(X), under training data constraints. In this way, the benefits of the vector-based analytical processing of μA are passed on to the fuzzy sets A that they represent.

Specifically, we consider, in this chapter, a class Φ of fuzzy sets A whose μA satisfy the following three conditions.

1. The universe of discourse X is a vector space endowed with a scalar product. Thus X may be a (possibly weighted) Euclidian Space or a separable Hilbert Space. Typically, the elements x = x(p) of X are feature vectors associated with objects p of interest in the discourse. Then, the membership functions μA(.) :


X → [0, 1] generalize into membership "functionals", i.e., mappings from a vector space to the real line. These functionals are, in general, nonlinear in x;

2. The μA are analytic, and hence representable by abstract power series in the elements x of X.

3. The μA belong to a positive cone Λ of a Reproducing Kernel Hilbert Space F(X) of analytic functionals on the Hilbert Space X. The space F(X) was introduced in 1980 by de Figueiredo and Dwyer [6, 7] for nonlinear functional analysis of signals and systems. F(X) constitutes a generalization of the Symmetric Fock Space, the state space of non-self-interacting Boson fields in quantum field theory.

In the next section, we discuss interpretations of μA based on an attribute variable A (e.g., "old", "rich", "educated", "hot", . . . , or any event of interest) and a judgment criterion J, and, in particular, focus on the role of the characterization of μA as a generalization of a posterior probability.

This provides the motivation and basis for the material presented in the subsequent sections. In particular, as we have just indicated, under the above three conditions, we derive a "best model" Â for a fuzzy set A based on a best approximation μ̂A of its membership functional μA in the space F(X).

Let {vi, ui}, i = 1, . . . , q, denote a set of training pairs, where, for i = 1, . . . , q, ui is the value μA(vi) to be assigned by an intelligent machine or a human expert to a prototype feature vector vi assumed to belong to A. Then μ̂A is obtained by minimizing the maximum error between μ̂A and all other μA in an uncertainty set in F(X), subject to a positivity constraint and interpolating constraints μ̂A(vi) = ui, i = 1, . . . , q. This uncertainty set may be viewed as a fuzzy set of fuzzy set representatives. It turns out that the closed-form solution for μ̂A thus obtained appears naturally in the form of an artificial neural network introduced by us in a general theoretical setting in 1990 and called an Optimal Interpolative Neural Network (OINN) [8, 9]. In conclusion, an example of application to medical diagnosis is given.

2 Interpretation of Fuzzy Sets Via Their Membership Functionals

2.1 What is a Fuzzy Set?

Let us recall, at the outset, using our own clarifying notation, the following:

Definition 1. Let X = {x} denote a universal collection of objects x constituting the universe of discourse. Then A is called a fuzzy set of objects x in X if and only if the following conditions are satisfied:

(a) There is an attribute, or an event characterized by the respective attribute, A, and a judgment criterion J, on the basis of which the membership of x in A is judged.


(b) For a given A and J, there is a membership functional μA(A, J, .) : X → [0, 1], whose value at x expresses, on a scale of zero to one, the extent to which x belongs to A.

Under these conditions, the fuzzy set A is defined as

A = Support{μA} = {x ∈ X : μA(A, J, x) ≠ 0}   (1)

From now on, for simplicity in notation, we will drop A and J in the arguments of μA unless, for the sake of clarity, they need to be invoked. □ 1

Remark 1. Let B denote an ordinary subset (called a "crisp" subset) of X with the characteristic function ϕB : X → {0, 1} ⊂ [0, 1], i.e.,

ϕB(x) = { 1, x ∈ B
          0, otherwise   (2)

It is clear from (1) and (2) that B is a fuzzy set with membership functional

μB(x) = ϕB(x), x ∈ X (3)

Thus a crisp subset constitutes a special case of a fuzzy set, and all statements regarding fuzzy sets hold for crisp sets or subsets, with the qualification expressed by (3). For this reason, fuzzy sets may well have been called "generalized sets." □
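A minimal sketch of Remark 1, with the characteristic function of Eq. (2) used as the membership functional of Eq. (3) (the example set B is an illustrative choice):

```python
def char_fn(B):
    """Characteristic function of a crisp set B, Eq. (2): 1 on B, 0 elsewhere."""
    return lambda x: 1.0 if x in B else 0.0

B = {2, 3, 5}          # an illustrative crisp subset of the universe
mu_B = char_fn(B)      # Eq. (3): the membership functional of B viewed as a fuzzy set

assert mu_B(3) == 1.0  # element of B: full membership
assert mu_B(4) == 0.0  # not in B: zero membership
```

The only special property of a crisp set is that its membership functional takes values in {0, 1} rather than the full interval [0, 1].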

Remark 2. According to condition (a) of Definition 1, there is a judgment J involved in the mathematical definition of a fuzzy set. This judgment may come from a human. For this reason, just as conventional (crisp) set theory provides the foundations for conventional mathematical analysis, fuzzy set theory can provide a rigorous foundation for a more general mathematical analysis, which takes human judgment into account in the definitions of appropriate measure spaces, functions, functionals, and operators. This may explain why fuzzy set theory has led to "unconventional" mathematical developments, like fuzzy logic and approximate reasoning, used in modeling human/machine systems. □

2.2 Interpretation of Membership Functionals

For a given universe of discourse X, the following are two of the possible important interpretations of fuzzy sets via their respective membership functions. They are:

1 Ends of formal statements will be signaled by □


1. In the most general case (without the restriction that X be a vector space), a fuzzy set A may be viewed as a value-based set for which the membership functional μA assigns a normalized value μA(A, J; x) to an object x, based on an attribute A and judgment J. Note that the other interpretation listed below may also be considered value-based, where "value" is represented by the value of the dependent variable used in scoring the uncertainty in A.

2. As an uncertainty set, one may interpret the value of a membership functional as the value of a functional analogous to the posterior probability of A given x.

In the following two sub-sections we discuss respectively these two interpretations.

2.2.1 Value-Based Fuzzy Sets

This type of interpretation can best be explained by means of an example such as the following.

Example 1. Value-Based Fuzzy Sets. Let the set of all persons p of ages between 0 and 100 years generate the universe of discourse. Specifically for this problem, choose the "Age" x(p) of p as the feature vector belonging to the one-dimensional real Euclidian Space E1. Thus we have for the universe of discourse

X = {x ∈ E1 : 0 ≤ x ≤ 100}. (4)

We can construct various fuzzy sets using different attributes, say, an attribute A1 such as "Old" or an attribute A2 such as "Rich", as well as judgments from different entities, say a judgment J1 from a private sector organization or a judgment J2 from a public sector organization.

For the fuzzy sets "Old", select, for example, the expression for the membership function (where we use this term instead of "functional" because the argument is a scalar)

g(λ, x) = [1 − exp(−100λ)]^−1 [1 − exp(−λx)],   0 ≤ x ≤ 100   (5)

Then, assuming that judgments J1 and J2 correspond to some values λ1 and λ2 of the parameter λ, we can construct the fuzzy sets A1 and A2 of "Old" persons as follows:

A1 = {x ∈ [0, 100] : μA1(x) = μA1( A1, J1, x) = g(λ1, x)} (6a)

A2 = {x ∈ [0, 100] : μA2(x) = μA2( A1, J2, x) = g(λ2, x)} (6b)

For the fuzzy sets "Rich", we may select, for example, another expression for the membership function

h(λ, x) = [1 − exp(−10^4 λ)]^−1 [1 − exp(−λx^2)],   0 ≤ x ≤ 100,   (7)

and construct the fuzzy sets A3 and A4 of “Rich” persons


Fig. 1 Old: membership function curves μA(x) versus x (Age), for λ1 = 0.03 and λ2 = 0.05

A3 = {x ∈ [0, 100] : μA3(x) = μA3( A2, J1, x) = h(λ1, x)} (8a)

A4 = {x ∈ [0, 100] : μA4(x) = μA4( A2, J2, x) = h(λ2, x)} (8b)

Figures 1 and 2 depict the shape of the membership function curves for the fuzzy sets of old and rich persons for the nominal values of the parameters λ shown. □
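The membership functions g of Eq. (5) and h of Eq. (7) can be evaluated directly (a sketch in Python; the sample age x = 65 is an illustrative choice, not from the text):

```python
import math

def g(lam, x):
    """Membership function for "Old", Eq. (5), on 0 <= x <= 100."""
    return (1 - math.exp(-lam * x)) / (1 - math.exp(-100 * lam))

def h(lam, x):
    """Membership function for "Rich", Eq. (7), on 0 <= x <= 100."""
    return (1 - math.exp(-lam * x ** 2)) / (1 - math.exp(-1e4 * lam))

# Judgments correspond to parameter values, as in Figs. 1 and 2.
mu_A1 = g(0.03, 65)     # degree to which x = 65 belongs to "Old" under λ1 = 0.03
mu_A3 = h(0.0005, 65)   # degree to which x = 65 belongs to "Rich" under λ3 = 0.0005
```

The normalizing factors guarantee g(λ, 100) = h(λ, 100) = 1, so both functions run monotonically from 0 to 1 over the universe of discourse; different judgments simply reshape the curve via λ.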

Remark 3. We glean from the above example the following important capabilities in the modeling of fuzzy sets: (1) attributes or events, as well as judgments from an intelligent machine and/or human, can be taken into account; (2) it is not required to partition, at the outset, the universal set into subsets, because different fuzzy sets may coincide with one another and, in fact, with the entire universe of discourse. What distinguishes one fuzzy set from another is the difference in their membership functionals. This property, which is useful in modeling and processing the content of sets, is absent in the analysis of ordinary sets. □

Example 2. Image as an "Object" x or a "Value-Based Fuzzy Set" A. Consider an image f as an array of pixels

f = { f (u), u = (i, j) : i = 1, . . . , M, j = 1, . . . , N} (9)

In the context of fuzzy sets, f allows two interpretations:

(a) f is an object x = x( f ) belonging to a fuzzy set A of images;

(b) f is a fuzzy set of pixel locations u = (i, j), the gray level f(u) being the membership value of the fuzzy set f at the pixel location u.


Fig. 2 Rich: membership function curves μA(x) versus x, for λ3 = 0.0005 and λ4 = 0.0008

In many image-oriented applications, such as in medical imaging and SAR and HRR radar applications, interpretation (a) is used. Specifically, in interpretation (a), an image f is converted into a feature vector x = x(f), and then the membership functional pertaining to a target class A assigns to x a score value μA(x), between 0 and 1, according to the judgment of an appropriately trained machine or expert. An example of this type of application to the diagnosis of Alzheimer's disease is described at the end of this chapter.

An application of interpretation (b) occurs in hyper-spectral image processing in remote sensing, where each sub-pixel is assigned, according to its spectral signature, a membership value w.r.t. different classes of materials, e.g., grass, sand, rock, . . . □

Example 3. Motion-Blurred Image. Consider the blurred image f shown in Fig. 3. It is a photo of a rapidly moving car taken by a still camera during a ChampCar championship race. The car is an MCI racing car that participated in the race. The blur is not caused by random noise but by a motion operator applied to f. Let A denote a class of racing cars. The membership of this car in A with a prescribed degree of certainty can be assigned by an expert. If the image is de-blurred using a de-blurring operator G, one obtains a de-blurred version of f, namely g = G(f).

It will then be easier to classify g and therefore to assign a higher value w.r.t. the membership class to which it correctly belongs. So modeling of the membership functional based on uncertainty models that are not necessarily probabilistic can be of value in some applications, and, in fact, the approach that we will be presenting allows this option. □


Fig. 3 Blurred image of an MCI racing car taken by a still camera while racing in a ChampCar Championship race. The entire image may be considered to be an object x in the universe of car images X = {x}, or a fuzzy set in the universe of 2-dimensional pixel locations, as explained in the text

2.2.2 Uncertainty-Based Fuzzy Sets

An uncertainty-based fuzzy set A is one for which the membership value μA(x) represents the degree of certainty, on a scale from 0 to 1, with which an element x of A has an attribute A or belongs to a category A. The assignment of the value can be made based on machine-, and/or human-, and/or probabilistically-based judgment.

In a probabilistic setting, it is intuitively appealing to think that the human brain makes decisions by minimizing the conditional risk rather than processing likelihood functions. For example, if one had the choice of investing a sum x in one of several stocks, one would choose the stock which one judges to have the highest probability of success.

To state the above in formal terms, we recall that if A1, . . . , An are n attributes, or events, or hypotheses, then the Bayes conditional risk, or cost of making a decision in favor of Ai, is

C(Ai/x) = 1 − P(Ai/x)   (10)

So the optimal (minimum-risk) decision corresponds to assigning x to the Ai for which P(Ai/x) is maximum, i.e., such that

P(Ai/x) ≥ P(Aj/x),   j ≠ i,   j = 1, . . . , n   (11)
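The minimum-risk rule of (10)-(11) reduces to choosing the attribute with the largest posterior, which can be sketched as follows (the posterior values are hypothetical, for illustration only):

```python
def min_risk_decision(posteriors):
    """Assign x to the attribute A_i with maximum posterior P(A_i | x),
    which minimizes the Bayes conditional risk C(A_i | x) = 1 - P(A_i | x)."""
    return max(range(len(posteriors)), key=lambda i: posteriors[i])

# Hypothetical posteriors P(A_i | x) for three attributes A_1, A_2, A_3.
p = [0.2, 0.5, 0.3]
choice = min_risk_decision(p)   # index 1, i.e. A_2: the minimum-risk decision
```

Since the risk in (10) is one minus the posterior, maximizing the posterior and minimizing the conditional risk are the same decision rule.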


So, on the above basis, we have the following proposition.

Proposition 1. Let A denote the attribute or event characterizing a generic fuzzy set A, based on an (objective or subjective) judgment J. Then, in a probabilistic setting, the membership function μA constitutes a generalization of the posterior probability, expressed as

μA( A, J ; x) = P( A|x), (12)

if the objects x in A are random.

If x is a nonrandom parameter appearing in the expression of the probability of A, then the above expression is replaced by

μA(A, J; x) = P(A; x)   (13)

In the case of (13), the posterior probability plays a role similar to that of the likelihood function in conventional estimation theory. □

Remark 4. Relaxing Restrictions on the Membership Values of Fuzzy Set Combinations. Fuzzy intersections, fuzzy unions, and fuzzy complements of fuzzy sets A and B defined by Zadeh [1] are equivalent to the following definitions:

A ∩ B = {x ∈ X : min[μA(x), μB(x)] ≠ 0}   (14)

A ∪ B = {x ∈ X : max[μA(x), μB(x)] ≠ 0}   (15)

AC = {x ∈ X : 1 − μA(x) ≠ 0}   (16)

There are several ways one may agree to assign membership values to these sets. One that is most common is

μA∩B(x) = min[μA(x), μB(x)] (17)

μA∪B (x) = max[μA(x), μB(x)] (18)

μAC (x) = 1− μA(x) (19)

Another set of assignments that leads to the same fuzzy sets defined in (14)-(16) is the one given by equations (20) to (22) below. The advantage of the latter choice is that it satisfies the same rules as the posterior probabilities identified in (12) and (13). Thus the following rules enable machines and humans to use the same framework in processing information.

μA∩B(x) = μA(x)μB(x) (20)

μA∪B(x) = μA(x) + μB(x) − μA(x)μB(x) (21)

μA^C(x) = 1 − μA(x) (22)
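These probabilistic-style connectives are straightforward to implement; a small sketch (the membership values are hypothetical) that also checks that the pair (20)–(21) satisfies De Morgan's law under the complement (22):

```python
# Probabilistic-style fuzzy connectives of (20)-(22): product intersection,
# probabilistic-sum union, and standard complement. These obey the same
# algebra as posterior probabilities, unlike the min/max pair of (17)-(18).

def f_and(a, b):          # (20)
    return a * b

def f_or(a, b):           # (21)
    return a + b - a * b

def f_not(a):             # (22)
    return 1.0 - a

mu_a, mu_b = 0.6, 0.5
# f_and -> 0.3, f_or -> 0.8, f_not(mu_a) -> 0.4
# De Morgan holds for this pair: not(A and B) == (not A) or (not B)
assert abs(f_not(f_and(mu_a, mu_b)) - f_or(f_not(mu_a), f_not(mu_b))) < 1e-12
```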

Remark 5. Conditional Membership. More generally, following the analogy with probability, one may define a conditional membership μA/B(x) in a straightforward way, and generalize (20) and (21) by


μA∩B(x) = μA/B(x)μB(x) (23)

μA∪B(x) = μA(x)+ μB(x)− μA/B(x)μB(x). (24)
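As a sketch of the generalization (for the check, we assume the special case in which B carries no information about A, so that μA/B = μA), (23)–(24) then reduce to (20)–(21):

```python
# Conditional-membership forms of (23)-(24). mu_a_given_b plays the role
# of mu_{A/B}(x); when mu_{A/B} = mu_A, the unconditional rules (20)-(21)
# are recovered.

def mu_and(mu_a_given_b, mu_b):        # (23)
    return mu_a_given_b * mu_b

def mu_or(mu_a, mu_b, mu_a_given_b):   # (24)
    return mu_a + mu_b - mu_a_given_b * mu_b

mu_a, mu_b = 0.7, 0.4
# With mu_{A/B} = mu_A, (23) and (24) coincide with (20) and (21):
assert mu_and(mu_a, mu_b) == mu_a * mu_b
assert mu_or(mu_a, mu_b, mu_a) == mu_a + mu_b - mu_a * mu_b
```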

Example 4. Memoryless Communication Channel. Fig. 4 depicts a memoryless communication channel, transmitting sets A1, · · · , An of input symbols from an alphabet X = {x1, · · · , xm}. This system consists of n fuzzy sets A1, · · · , An with a Universe of Discourse X = {x1, · · · , xm}. The xi and Aj, i = 1, · · · , m, j = 1, · · · , n, may be viewed respectively as inputs to, and outputs from, the channel, and the membership functional values μAj(xi) as channel transition probabilities P(Aj | xi). The interpretation is explained under the figure title. □

3 Fuzzy Sets in a Euclidean Space

Let the universal set X be an N-dimensional Euclidean space E^N over the reals, with the scalar product of any two elements x = (x1, . . . , xN)^T and y = (y1, . . . , yN)^T denoted and defined by

⟨x, y⟩ = x^T y = Σ_{i=1}^{N} xi yi, (25)

where the superscript T denotes the transpose. If E^N is a weighted Euclidean space with a positive definite weight matrix R, the scalar product is given by

⟨x, y⟩ = x^T R^{−1} y (26)
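A minimal numerical sketch of (25) and (26) (the vectors and the weight matrix below are hypothetical):

```python
import numpy as np

# Scalar products on E^N: the Euclidean form (25) and the weighted form (26),
# where R must be positive definite.

def inner(x, y):
    return float(x @ y)                      # (25): x^T y

def inner_weighted(x, y, R):
    return float(x @ np.linalg.solve(R, y))  # (26): x^T R^{-1} y

x = np.array([1.0, 2.0])
y = np.array([3.0, 4.0])
R = np.diag([1.0, 4.0])   # hypothetical positive definite weight matrix
# inner(x, y) -> 11.0; inner_weighted(x, y, R) -> 3/1 + 8/4 = 5.0
```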

As indicated before, typically such an X would constitute the space of finite-dimensional feature vectors x associated with the objects alluded to in a given discourse, as in the following example.

Example 5. The database of a health care clinic for the diagnosis of the conditions of patients w.r.t. various illnesses could constitute a universal set X. In this space, each patient p would be represented by a feature vector x = x(p) ∈ E^N, consisting of N observations made on p. These observations could consist, for example, of lab test results and clinical test scores pertaining to p. Then a subset A in the feature space X with a membership functional μA could characterize the feature vectors of patients possibly affected by an illness associated with A. For a given patient p, a physician or a computer program would assign a membership value μA(x(p)) = μA(x) to x indicating the extent of sickness of the patient p w.r.t. that illness (e.g., very well, fairly well, okay, slightly sick, or very sick). □


[Fig. 4 about here: a diagram showing the elements x1, . . . , xi, . . . , xm of X (the Universe of Discourse), a completely additive class of attributes/events A1, . . . , An, and the fuzzy sets corresponding to these attributes, with membership values μAj(xi) labeling the channel transitions.]

Fig. 4 Memoryless communication channel model of a Fuzzy-Set-Based System

3.1 Membership Functionals as Vectors in the RKH Space F(E^N)

Returning to our formulation, in the present case, F is the space of bounded analytic functionals on a bounded set Ω ⊂ E^N defined by

Ω = {x ∈ E^N : ‖x‖ ≤ γ} (27)

where γ is a positive constant selected according to a specific application. The conditions under which F is defined and the mathematical properties of F are given in the Appendix.


Remark 6. It is important to recall that, under the assumptions made in this paper, the fuzzy sets A under consideration are admissible, the membership functionals μA belong to the positive cone in F,

Λ = {μA ∈ F : μA(x) ≥ 0}, (28)

and satisfy the condition

μA(x) ∈ [0, 1] ∀ x ∈ Ω ⊂ E^N (29)

In our algorithms for modeling and processing μA, these constraints are satisfied. □

In view of the analyticity and other conditions satisfied by members of F, stated in the Appendix, any μ pertaining to an admissible fuzzy set can be represented in the form of an N-variable power series, known as an abstract Volterra series, in these N variables, absolutely convergent at every x ∈ Ω, expressible by:

μ(x) = Σ_{n=0}^{∞} (1/n!) μn(x) (30)

where μn are homogeneous Hilbert–Schmidt (H-S) polynomials of degree n in the components of x, a detailed representation of which is given in the Appendix.

The scalar product of any μA and μB ∈ F, corresponding to fuzzy sets A and B, is defined by

⟨μA, μB⟩_F = Σ_{n=0}^{∞} (1/λn)(1/n!) Σ_{|k|=n} (|k|!/k!) ck dk (31)

where λn, n = 0, 1, 2, . . ., is a sequence of positive weights expressing the prior uncertainty in the terms μn, n = 0, 1, 2, . . ., satisfying (A11) in the Appendix, and ck and dk are defined for μA and μB in the same way as ck is defined for μ in (A4) and (A8) in the Appendix.

The reproducing kernel K(x, z) in F is

K(x, z) := ϕ(⟨x, z⟩) = Σ_{n=0}^{∞} (λn/n!) (⟨x, z⟩)^n, (32)

where ⟨x, z⟩ denotes the scalar product of x and z in E^N.

In the special case in which λn = λ0^n, ϕ is an exponential function, and thus K(x, z) becomes

K(x, z) = exp(λ0 ⟨x, z⟩). (33)
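The agreement between (32) and (33) is easy to check numerically when λn = λ0^n (the value of λ0 and the test points below are arbitrary choices, not from the chapter):

```python
import math
import numpy as np

# Reproducing kernel (32) with geometric weights lambda_n = lambda0**n,
# whose series sums to the exponential kernel (33).

def kernel_series(x, z, lam0, n_terms=40):
    s = float(np.dot(x, z))
    return sum(lam0 ** n / math.factorial(n) * s ** n for n in range(n_terms))

def kernel_exp(x, z, lam0):
    return math.exp(lam0 * float(np.dot(x, z)))

x = np.array([0.3, 0.4])
z = np.array([0.5, 0.2])
# The truncated series and the closed form agree to machine precision here.
```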


Remark 7. For a simple explanation of the above formulation for the case in which fuzzy sets are intervals on the real line, please see [16].

3.2 Best Approximation of a Membership Functional in F

Now we address the important issue of the recovery of a membership functional μ, associated with an admissible fuzzy set A, from a set of training pairs (vi, ui), i = 1, 2, . . . , q, i.e.,

{(vi, ui) : vi ∈ A ⊂ E^N, ui = μ(vi) ∈ R^1, i = 1, · · · , q}, (34)

and under the positivity constraint (4), which we rewrite here as

0 ≤ μ(x) ≤ 1, x ∈ A (35)

The problem of finding the best approximation μ̂ of μ can be posed as the solution of the following optimization problem in F:

inf_{μ̂ ∈ F} sup_{μ ∈ F} ‖μ − μ̂‖_F subject to (34) and (35) (36)

This is a quadratic programming problem in F that can be solved by standard algorithms available in the literature. However, a procedure that usually works well (see [10]) is the one which solves (36) recursively using an RLS algorithm under (34) alone. In such a learning-process setting, the pairs in (34) are used sequentially and a new pair is added whenever (35) is violated. This procedure is continued until (35) is satisfied. Such a procedure leads to the following closed form for μ̂, where the pairs (vi, ui) are all the pairs used until the end of the procedure,

μ̂(x) = u^T G^{−1} K(x) (37)

where

K(x) = (ϕ(⟨v1, x⟩), ϕ(⟨v2, x⟩), . . . , ϕ(⟨vq, x⟩))^T = (s1(x), s2(x), . . . , sq(x))^T (38)


u = (u1, . . . , uq)^T (39)

and G is a q × q matrix with elements Gij, i, j = 1, · · · , q, defined by

Gij = ϕ(⟨vi, vj⟩) (40)

In terms of the above, another convenient way of expressing (37) is

μ̂(x) = w1 ϕ(⟨v1, x⟩) + w2 ϕ(⟨v2, x⟩) + · · · + wq ϕ(⟨vq, x⟩)
      = w1 s1(x) + w2 s2(x) + · · · + wq sq(x) (41)

where wi are the components of the vector w obtained by

w = G^{−1} u (42)
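Computationally, (37)–(42) amount to kernel interpolation: build the Gram matrix G, solve for w, and evaluate μ̂ as a weighted sum of kernel sections. A minimal sketch (the training pairs are hypothetical, and the exponential kernel (33) with λ0 = 1 is an arbitrary choice):

```python
import numpy as np

# Best-approximation membership functional of (37)/(41)-(42):
# G_ij = phi(<v_i, v_j>) per (40), w = G^{-1} u per (42), and
# mu_hat(x) = sum_i w_i * phi(<v_i, x>) per (41).

def phi(s, lam0=1.0):
    return np.exp(lam0 * s)        # exponential kernel (33), elementwise

def fit(V, u, lam0=1.0):
    G = phi(V @ V.T, lam0)         # Gram matrix (40)
    return np.linalg.solve(G, u)   # weight vector (42)

def mu_hat(x, V, w, lam0=1.0):
    return float(phi(V @ x, lam0) @ w)   # evaluation (41)

V = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # training vectors v_i
u = np.array([0.1, 0.9, 0.4])                        # observed memberships u_i
w = fit(V, u)
# By construction the interpolant reproduces each training pair (v_i, u_i).
val = mu_hat(np.array([0.5, 0.5]), V, w)             # evaluation at a new x
```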

Remark 8. It is important to note that the functional μ̂ expressed by (37) or, equivalently, (41) need not satisfy the condition (35) required for it to be a membership functional. If this condition is violated at any point, say x0 ∈ X, then x0 and its observed membership value at x0 are inserted as an additional training pair (vq+1, uq+1) in the training set (34), and the procedure of computing the parameter vector w is repeated. This process is continued using the RLS algorithm until it converges to a membership functional that satisfies (35). For details on a general recursive learning algorithm that implements this process, see [10].
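The augmentation loop of Remark 8 can be sketched as follows. This is a simplified batch refit rather than the RLS algorithm of [10]; the kernel, probe points, and the `oracle` callback supplying observed memberships are all hypothetical illustration choices:

```python
import numpy as np

# Remark 8 sketch: whenever the fitted functional leaves [0, 1] at some
# point x0, the pair (x0, observed membership at x0) is appended to the
# training set and the weights are recomputed.

def phi(s):
    return np.exp(s)                          # exponential kernel (33)

def fit(V, u):
    return np.linalg.solve(phi(V @ V.T), u)   # w = G^{-1} u, as in (42)

def mu_hat(x, V, w):
    return float(phi(V @ x) @ w)              # evaluation, as in (41)

def fit_constrained(V, u, probes, oracle):
    """Refit until mu_hat stays in [0, 1] at every probe point."""
    V = [np.asarray(v, float) for v in V]
    u = list(u)
    while True:
        A = np.array(V)
        w = fit(A, np.array(u))
        bad = [x for x in probes
               if not 0.0 <= mu_hat(np.asarray(x, float), A, w) <= 1.0]
        if not bad:
            return A, np.array(u), w
        V.append(np.asarray(bad[0], float))   # add the pair (v_{q+1}, u_{q+1})
        u.append(oracle(bad[0]))

# Hypothetical data: the initial 2-point fit overshoots 1 at [2, 0], so the
# loop asks the "oracle" (the observed membership there) and refits.
Vf, uf, wf = fit_constrained([[0.0, 0.0], [1.0, 0.0]], [0.1, 0.9],
                             [[2.0, 0.0]], oracle=lambda x: 0.95)
```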

3.3 Neural Network Realization of a Membership Functional in the Space F

The structure represented by (37) or, equivalently, (41) is shown in the block diagram of Fig. 5. It is clear from this figure that this structure corresponds to a two-hidden-layer artificial neural network, with the synapses and activation functions labeled according to the symbols appearing in (37) and (41). Such an artificial neural network, as an optimal realization of an input–output map in F based on a training set as given by (34), was first introduced by de Figueiredo in 1990 [8, 9], who called it an Optimal Interpolative Neural Network (OINN). It now turns out, as developed above, that such a network is also an optimal realization of the membership functional of an admissible fuzzy set.


Fig. 5 Optimal realization of the membership functional μA of a fuzzy set A. [The figure shows a two-hidden-layer network: inputs x1, x2, . . . , xN (the feature vector x), first-layer weights vij, hidden-node outputs s1, s2, . . . , si, . . . , sq, output-layer weights wi, and output μA(x).]

4 Fuzzy Sets in L2

Two important cases of applications in which the universal set X is infinite-dimensional are those in which the objects in X are waveforms x = {x(t) : a ≤ t ≤ b} or images x = {x(u, v) : a ≤ u ≤ b, c ≤ v ≤ d}. By simply replacing the formulas for the scalar product in X given by (25) or (26) for the Euclidean case with their counterparts for the present case, all the remaining developments follow in the same way as before, with the correct interpretation of the inner products and with the understanding that summations now become integrations.

For example, the scalar products, analogous to (25) and (26), for the case of two waveforms x and y would be

⟨x, y⟩ = ∫_a^b x(t) y(t) dt (43)

and

⟨x, y⟩ = ∫_a^b ∫_a^b x(t) R^{−1}(t, s) y(s) dt ds (44)

Then the synaptic weight summations in Fig. 5 would be converted into integrals representing matched filters matched to the prototype waveforms in the respective fuzzy set. The author called such networks dynamic functional artificial neural networks (D-FANNs) and provided a functional analytic method for their analysis in [12] and application in [14].
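For discretely sampled waveforms, the scalar product (43) is just a quadrature sum; a small sketch using the trapezoidal rule (the test signals below are arbitrary):

```python
import numpy as np

# Discretized L2 scalar product (43) for waveforms sampled on a grid t,
# approximated with the trapezoidal rule. The matched-filter integrals
# replacing the sums of Fig. 5 have exactly this form.

def l2_inner(x, y, t):
    f = x * y
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

t = np.linspace(0.0, 1.0, 2001)
x = np.sin(2 * np.pi * t)
y = np.cos(2 * np.pi * t)
# Over one full period, <sin, sin> = 1/2 and <sin, cos> = 0.
```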


5 Applications

Due to limitations in space, it is not possible to dwell on applications, except to briefly mention the following example of an analysis of a database of brain spectrogram image feature vectors x belonging to two mutually exclusive fuzzy sets A and B of images of patients with possible Alzheimer's and vascular dementia (for details, see [13]).

Figure 6 shows the images of 12 slices extracted from a patient brain by single photon emission computed tomography (SPECT) with technetium-99m hexamethylpropyleneamine oxime, abbreviated as HMPAO-99mTc. The components of

Fig. 6 Images of slices of brain spectrogram images of a prototype patient. The template frames are used to extract the components of the feature vector x characterizing the patient


[Fig. 7 about here: an OINN taking the feature vector x = (x1, x2, . . . , xN) of SPECT image region-of-interest (ROI) intensities as input, correlating it with prototype patients via weights α1, α2, β1, β2, and producing scores μA(x) and μB(x) for fuzzy sets A and B, from which the classification into a diagnosis class is obtained.]

Fig. 7 An OINN (Optimal Interpolative Neural Network) used to realize the membership functionals of two fuzzy sets of patients with two different types of dementia

the feature vector x are the average intensities of the image slices in the regions of interest in the templates displayed in the lower part of Fig. 6. This feature vector was applied as input to the OINN shown in Fig. 7, which is self-explanatory. The results of the study, including a comparison of the performance of a machine and a human expert in achieving the required objective, are displayed in Table 1.

Table 1 Result of the study of dementia based on the fuzzy set vector functional membership analysis

Source of Classification          Rate of Correct Classification*
Fuzzy-Set-Based Classification    81%
Radiological Diagnostician        77%

*41 subjects: 15 Probable AD, 12 Probable VD, 10 Possible VD, 4 Normal.

6 Conclusion

For the case in which the universe of discourse X is a vector space, we presented a framework for modeling and processing fuzzy sets based on the representation of their membership functionals as vectors belonging to a Reproducing Kernel Hilbert Space F(X) of analytic functionals. A number of interesting properties of such models have been described, including that of best approximation of membership functionals as vectors in F(X) under training data constraints. These best


approximation models of membership functionals naturally lead to their realization as artificial neural networks. Due to the one-to-one correspondence between fuzzy sets and their membership functionals, the benefits of vector processing of the latter are automatically carried over to the fuzzy sets to which they belong. Potential application of the underlying technology has been illustrated by an example of computationally intelligent diagnosis of Alzheimer's disease.

References

1. Zadeh, L.A., "Fuzzy sets", Information and Control, vol. 8, pp. 338–353, 1965.
2. Klir, G.J. and Yuan, B., Fuzzy Sets, Fuzzy Logic, and Fuzzy Systems – Selected Papers by Lotfi A. Zadeh, World Scientific Publishing Co., 1996.
3. Klir, G.J. and Folger, T.A., Fuzzy Sets, Uncertainty, and Information, Prentice Hall, 1988.
4. Ruan, D. (Ed.), Fuzzy Set Theory and Advanced Mathematical Applications, Kluwer Academic, 1995.
5. Bezdek, J.C. and Pal, S.K. (Eds.), Fuzzy Models for Pattern Recognition, IEEE Press, 1992.
6. de Figueiredo, R.J.P. and Dwyer, T.A.W., "A best approximation framework and implementation for simulation of large-scale nonlinear systems", IEEE Trans. on Circuits and Systems, vol. CAS-27, no. 11, pp. 1005–1014, November 1980.
7. de Figueiredo, R.J.P., "A generalized Fock space framework for nonlinear system and signal analysis", IEEE Trans. on Circuits and Systems, vol. CAS-30, no. 9, pp. 637–647, September 1983 (special invited issue on "Nonlinear Circuits and Systems").
8. de Figueiredo, R.J.P., "A New Nonlinear Functional Analytic Framework for Modeling Artificial Neural Networks" (invited paper), Proceedings of the 1990 IEEE International Symposium on Circuits and Systems, New Orleans, LA, May 1–3, 1990, pp. 723–726.
9. de Figueiredo, R.J.P., "An Optimal Matching Score Net for Pattern Classification", Proceedings of the 1990 International Joint Conference on Neural Networks (IJCNN-90), San Diego, CA, June 17–21, 1990, vol. 2, pp. 909–916.
10. Sin, S.K. and de Figueiredo, R.J.P., "An evolution-oriented learning algorithm for the optimal interpolative neural net", IEEE Trans. Neural Networks, vol. 3, no. 2, pp. 315–323, March 1992.
11. de Figueiredo, R.J.P. and Eltoft, T., "Pattern classification of non-sparse data using optimal interpolative nets", Neurocomputing, vol. 10, no. 4, pp. 385–403, April 1996.
12. de Figueiredo, R.J.P., "Optimal interpolating and smoothing functional artificial neural networks (FANNs) based on a generalized Fock space framework", Circuits, Systems, and Signal Processing, vol. 17, no. 2, pp. 271–287, Birkhäuser Boston, 1998.
13. de Figueiredo, R.J.P., Shankle, W.R., Maccato, A., Dick, M.B., Mundkur, P.Y., Mena, I., and Cotman, C.W., "Neural-network-based classification of cognitively normal, demented, Alzheimer's disease and vascular dementia from brain SPECT image data", Proceedings of the National Academy of Sciences USA, vol. 92, pp. 5530–5534, June 1995.
14. Eltoft, T. and de Figueiredo, R.J.P., "Nonlinear adaptive time series prediction with a Dynamical-Functional Artificial Neural Network", IEEE Transactions on Circuits and Systems, Part II, vol. 47, no. 10, pp. 1131–1134, October 2000.
15. de Figueiredo, R.J.P., "Beyond Volterra and Wiener: Optimal Modeling of Nonlinear Dynamical Systems in a Neural Space for Applications in Computational Intelligence", in Computational Intelligence: The Experts Speak, edited by Charles Robinson and David Fogel, volume commemorative of the 2002 World Congress on Computational Intelligence, IEEE and John Wiley & Sons, 2003.
16. de Figueiredo, R.J.P., "Processing fuzzy set membership functionals as vectors", to appear in a Special Issue of ISCJ honoring the 40th anniversary of the invention of fuzzy sets by L.A. Zadeh, International Journal on Soft Computing, 2007 (in press).


APPENDIX

Brief overview of the Space F

In this Appendix we briefly review some definitions and results on the space F(X) invoked in the body of the paper.

For the sake of brevity, we focus on the case in which X is the N-dimensional Euclidean space E^N. The developments also apply, of course, when X is any separable Hilbert space, such as l2, the space of square-summable strings of real numbers of infinite length (e.g., discrete-time signals), or L2, the space of square-integrable waveforms or images.

Thus let F consist of bounded analytic functionals μ on E^N. [We denote members of F by μ because, in this paper, they are candidates for membership functionals of admissible fuzzy sets.] Then such functionals μ can be represented by abstract power (Volterra functional) series on E^N satisfying the following conditions.

(a) μ is a real analytic functional on a bounded set Ω ⊂ E^N defined by

Ω = {x ∈ E^N : ‖x‖ ≤ γ} (A1)

where γ is a positive constant. This implies that there is an N-variable power series, known as a Volterra series in these N variables, absolutely convergent at every x ∈ Ω, expressible by:

μ(x) = Σ_{n=0}^{∞} (1/n!) μn(x) (A2)

where μn are homogeneous Hilbert–Schmidt (H-S) polynomials of degree n in the components of x, given by

μn(x) = Σ_{|k| = k1+k2+···+kN = n} c_{k1···kN} (|k|!/(k1! · · · kN!)) x1^{k1} · · · xN^{kN} (A3)

      = Σ_{|k|=n} ck (|k|!/k!) x^k (A4)

where, in the last equality, we have used the notation


k = (k1, · · · , kN) (A5)

|k| = Σ_{i=1}^{N} ki (A6)

k! = k1! · · · kN! (A7)

ck = c_{k1···kN} (A8)

x^k = x1^{k1} · · · xN^{kN} (A9)

(b) Let there be given a sequence of positive numbers

λ = {λ0, λ1, · · ·}, (A10)

where the weights λn express the prior uncertainty in the terms μn, n = 0, 1, 2, . . ., and satisfy

Σ_{n=0}^{∞} (1/λn) (γ^{2n}/n!) < ∞ (A11)
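As a quick numerical check (our choice of geometric weights λn = λ0^n, with arbitrary γ and λ0), the series in (A11) then sums in closed form to exp(γ²/λ0), so the condition holds for every γ:

```python
import math

# Partial sums of (A11) for lambda_n = lambda0**n:
# sum_n (1/lambda0**n) * gamma**(2n) / n! = exp(gamma**2 / lambda0).

def a11_sum(gamma, lam0, n_terms=60):
    return sum(gamma ** (2 * n) / (lam0 ** n * math.factorial(n))
               for n in range(n_terms))

gamma, lam0 = 1.5, 2.0
# a11_sum(gamma, lam0) agrees with exp(gamma**2 / lam0) to machine precision.
```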

Actually, some elements of λ, namely λk, k ∈ S, where S is a subset of non-negative integers, may be allowed to be zero, if we assume that μ belongs to the subspace of F consisting of power series in F with the terms of degree k ∈ S deleted.

(c) Finally, the coefficients ck in the terms μn of the abstract power series expansion of the membership functional satisfy the restriction

Σ_{n=0}^{∞} (1/(n! λn)) Σ_{|k|=n} (|k|!/k!) |ck|² < ∞ (A12)

The above allows us to state the following theorem. For a proof, see [6].

Theorem 1 (de Figueiredo / Dwyer [6]). Under (A1), (A2), (A11), and (A12), the completion of the set of nonlinear functionals μ in (A2) constitutes a Reproducing Kernel Hilbert Space (RKHS) F(E^N) = F on Ω, with:

(i) The scalar product between any μA and μB ∈ F, corresponding to fuzzy sets A and B, is defined by

⟨μA, μB⟩_F = Σ_{n=0}^{∞} (1/λn)(1/n!) Σ_{|k|=n} (|k|!/k!) ck dk (A13)

where ck and dk are defined for μA and μB in the same way as ck is defined for μ in (A4) and (A8).


(ii) The reproducing kernel K(x, z) in F is

K(x, z) := ϕ(⟨x, z⟩) = Σ_{n=0}^{∞} (λn/n!) (⟨x, z⟩)^n (A14)

where ⟨x, z⟩ denotes the inner product in E^N. In the special case in which λn = λ0^n, ϕ is an exponential function, and thus K(x, z) becomes

K(x, z) = exp(λ0 ⟨x, z⟩) (A15)

It is easily verified that the K(x, z) so defined satisfies the reproducing property

⟨K(x, ·), μ(·)⟩_F = μ(x) (A16)

Methods for best approximation of a functional in F, on which the developments in Sect. 3.2 are based, are described in [6, 7, 8, 9, 10] and [15].


Concept-Based Search and Questionnaire Systems

Masoud Nikravesh

Abstract World Wide Web search engines, including Google, Yahoo and MSN, have become the most heavily used online services (including targeted advertising), with millions of searches performed each day on unstructured sites. In this paper, we would like to go beyond the traditional web search engines that are based on keyword search, and beyond the Semantic Web, which provides a common framework that allows data to be shared and reused across applications. For this reason, our view is that "before one can use the power of web search, the relevant information has to be mined through a concept-based search mechanism and logical reasoning with capability for Q&A representation, rather than simple keyword search".

In this paper, we will first present the state of the search engines. Then we will focus on development of a framework for reasoning and deduction in the web. A new web search model will be presented. One of the main core ideas that we will use to extend our technique is to change the terms-documents-concepts (TDC) matrix into a rule-based and graph-based representation. This will allow us to evolve the traditional search engine (keyword-based search) into a concept-based search and then into a Q&A model. Given the TDC, we will transform each document into a rule-based model including its equivalent graph model. Once the TDC matrix has been transformed into maximally compact concepts based on graph representation and rules based on possibilistic relational universal fuzzy – type II (pertaining to composition), one can use the Z(n)-compact algorithm and transform the TDC into a decision tree and hierarchical graph that represents a Q&A model. Finally, the concepts of semantic equivalence and semantic entailment based on possibilistic relational universal fuzzy will be used as a basis for question-answering (Q&A) and inference from fuzzy premises. This will provide a foundation for approximate reasoning, a language for representation of imprecise knowledge, a meaning representation language for natural languages, precisiation of fuzzy propositions expressed in a natural language, and a tool for Precisiated Natural Language (PNL) and precisiation of meaning. The maximally compact documents based on the Z(n)-compact algorithm and

Masoud Nikravesh
BISC Program, Computer Sciences Division, EECS Department and Imaging and Informatics Group-LBNL, University of California, Berkeley, CA 94720, USA
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. 193
© Springer 2007


possibilistic relational universal fuzzy – type II will be used to cluster the documents based on concept-based, query-based search criteria.

Keywords: Semantic web · Fuzzy query · Fuzzy search · PNL · NeuSearch · Z(n)-compact · BISC-DSS

1 Introduction

What is the Semantic Web? "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." – Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001

"The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is a collaborative effort led by W3C with participation from a large number of researchers and industrial partners. It is based on the Resource Description Framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming." – W3C organization (http://www.w3.org/2001/sw/)

"Facilities to put machine-understandable data on the Web are becoming a high priority for many communities. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications." (http://www.w3.org/2001/sw/).

The Semantic Web is a mesh or network of information linked up in such a way that it can be accessed and processed by machines easily, on a global scale. One can think of the Semantic Web as an efficient way of representing and sharing data on the World Wide Web, or as a globally linked database. It is important to mention that Semantic Web technologies are still very much in their infancy, and there seems to be little consensus about the likely characteristics of such a system. It is also important to keep in mind that data that is generally hidden in one way or another is often useful in some contexts, but not in others. It is also difficult to use such information on a large scale, because there is no global system for publishing data in such a way that it can be easily processed by anyone. For example, one can think of information about local hotels, sports events, car or home sales, insurance data, weather information, stock market data, subway or plane times, Major League Baseball or Football statistics, television guides, etc. All this information is presented by numerous web sites in HTML format. Therefore, it is difficult to use such data/information in the way that one might want to. To build any semantic web-based system, it will become necessary to construct a powerful logical language for making inferences and reasoning, such that the system becomes


expressive enough to help users in a wide range of situations. This paper will try to develop a framework to address this issue. Figure 1 shows Concept-Based Web-Based Databases for Intelligent Decision Analysis, a framework for the next generation of the Semantic Web. In the next sections, we will describe the components of such a system.

2 Mining Information and Questionnaire System

The central tasks for most search engines can be summarized as 1) the query or user information request – do what I mean and not what I say!, 2) a model for the Internet and Web representation – web page collections, documents, text, images, music, etc., and 3) a ranking or matching function – degree of relevance, recall, precision, similarity, etc. One can bring clarification dialog, user profile, context, and ontology into an integrated framework to design a more intelligent search engine. The model will be used for intelligent information and knowledge retrieval through conceptual matching of text. The selected query doesn't need to match the decision criteria exactly, which gives the system a more human-like behavior. The model can also be used for constructing an ontology or terms related to the context of search or query to resolve ambiguity. The new model can execute conceptual matching dealing with context-dependent word ambiguity and produce results in a format that permits the user to interact dynamically to customize and personalize its search strategy. It is also possible to automate ontology generation and document indexing using term similarity based on the Conceptual Latent Semantic Indexing Technique (CLSI). Often it is hard to find the "right" term, and in some cases the term does not exist. The ontology is automatically constructed from the text document collection and can be used for query refinement. It is also possible to generate a conceptual document similarity map that can be used for an intelligent search engine based on CLSI, personalization and user profiling.

The user profile is automatically constructed from the text document collection and can be used for query refinement, to provide suggestions, and to rank the information based on the pre-existing user profile. Given the ambiguity and imprecision of the "concept" in the Internet, which may be described by both textual and image information, the use of Fuzzy Conceptual Matching (FCM) is a necessity for search engines.
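A minimal sketch of the weighted-keyword notion of a "concept" used in FCM follows (the concept, keywords, weights, and document below are hypothetical, chosen only to illustrate the matching mechanism, not taken from the chapter):

```python
# A "concept" as a fuzzy set over terms: each keyword carries a weight
# reflecting its importance. A document is scored by the normalized
# weighted overlap between its terms and the concept's keywords.

concept = {"hotel": 1.0, "reservation": 0.8, "travel": 0.5, "price": 0.3}

def concept_score(doc_terms, concept):
    """Normalized weighted overlap between a document and a concept."""
    hit = sum(w for term, w in concept.items() if term in doc_terms)
    return hit / sum(concept.values())

doc = {"cheap", "hotel", "travel", "booking"}
score = concept_score(doc, concept)   # (1.0 + 0.5) / 2.6
```

Unlike exact keyword lookup, a document scores partially even when only some of the concept's weighted terms appear, which is the "more human-like" matching behavior described above.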

In the FCM approach (Figs. 2 through 4), the "concept" is defined by a series of keywords with different weights depending on the importance of each keyword. Ambiguity in concepts can be defined by a set of imprecise concepts. Each imprecise concept in fact can be defined by a set of fuzzy concepts. The fuzzy concepts can then be related to a set of imprecise words given the context. Imprecise words can then be translated into precise words given the ontology and ambiguity resolution through clarification dialog. By constructing the ontology and fine-tuning the strength of links (weights), we can construct a fuzzy set to integrate piecewise the imprecise concepts and precise words to define the ambiguous concept. To develop FCM, a series of new tools are needed. The new tools are based on a fuzzy-logic-based


[Fig. 1 about here: a block diagram of a concept-based web-based database framework serving as a semantic web engine. It couples a search-engine unit (spiders/crawlers, indexed web pages, search index, user query, web retrieval, SVM, unsupervised clustering, web analyzer for terms, in-links, out-links and tf-idf weights, aggregation, SOM, community builder, data miner, visualization, image analyzer, image query and annotation, image annotation extraction and retrieval, summarization, deduction and Q&A systems, structured information, semantic web, intelligent system analyzer) with a decision-support unit (linguistic formulation of functional requirements, constraints, goals and objectives, and linguistic variables from decision makers' input; model management with query, aggregation, ranking and fitness evaluation; an evolutionary kernel using genetic algorithms, genetic programming and DNA, with selection, crossover and mutation; model and data visualization; knowledge representation, expert knowledge bases, a knowledge-base editor, an inference engine providing recommendation, advice and explanation, and knowledge refinement; IF . . . THEN rules from the knowledge engineer; data sources, databases and warehouses; interactive data visualization and visual decision making; data mining and knowledge discovery).]

Fig. 1 Concept-Based Web-Based Databases for Intelligent Decision Analysis; a framework for the next generation of Semantic Web

Page 202: Forging New Frontiers: Fuzzy Pioneers I

Concept-Based Search and Questionnaire Systems 197

[Figure: Term-Document Matrix entries evolving, by specialization, from bivalent-logic [0, 1] values, to statistical-probabilistic tf-idf values, to fuzzy-set values, to fuzzy set-object-based values]

Fig. 2 Evolution of Term-Document Matrix representation

method of computing with words and perceptions (CWP [1, 2, 3, 4]), with the understanding that perceptions are described in a natural language [5, 6, 7, 8, 9] and state-of-the-art computational intelligence techniques [10, 11, 12, 13, 14, 15]. Figure 2 shows the evolution of Term-Document Matrix representation. The [0,1] representation of the term-document matrix (or, in general, storing the document based on its keywords) is the simplest representation. Most of the current search engines such as GoogleTM, Teoma, Yahoo!, and MSN use this technique to store the term-document

[Figure: network of query nodes q1, q2, q3 and result nodes r1, r2, r3 connected with strength p; fuzzy conceptual similarity p(q1 to qi) and p(r1 to rj), and doc-doc similarity based on a fuzzy similarity measure and rules between authorities, hubs and doc-doc links]

Fig. 3 Fuzzy Conceptual Similarity


198 M. Nikravesh

[Figure: Webpages-by-Webpages matrix; each entry is a count qualified by (Text_Sim, In_Link, Out_Link, Rules, Concept), e.g. a row such as 1 (Text_Sim, In_Link, Out_Link, Rules, Concept) 2 (...) 0 (...) 0 (...)]

Text_Sim: based on the conceptual term-document matrix; it is a fuzzy set
In_Link & Out_Link: based on the conceptual links, which include actual links and virtual links; they are fuzzy sets
Rules: fuzzy rules extracted from data or provided by the user
Concept: Precisiated Natural Language definitions extracted from data or provided by the user

Fig. 4 Matrix representation of Fuzzy Conceptual Similarity model

information. One can extend this model by the use of an ontology and other similarity measures. This is the core idea that we will use to extend this technique to FCM. In this case, the existence of any keyword that does not occur directly in the document will be decided through its connection weight to other keywords that do occur in the document. For example, consider the following:

• if the connection weight (based on the ontology) between term "i" (i.e. Automobile) and term "j" (i.e. Transportation) is 0.7, and the connection weight between term "j" (i.e. Transportation) and term "k" (i.e. City) is 0.3:

◦ wij = 0.7
◦ wjk = 0.3

• and term "i" does not exist in document "I", term "j" exists in document "I", and term "k" does not exist in document "I":

◦ TiDI = 0
◦ TjDI = 1
◦ TkDI = 0

• Given the above observations and a threshold of 0.5, one can modify the term-document matrix as follows:

◦ TiDI' = 1
◦ TjDI' = 1
◦ TkDI' = 0

• In general, one can use a simple statistical model such as the co-occurrence matrix to calculate wij [5, 12].
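The bulleted procedure above can be sketched as follows. This is a minimal illustration, not the chapter's implementation; the function and variable names, the weights, and the 0.5 threshold are taken from the worked example.

```python
# Sketch of the ontology-based extension of a binary term-document matrix
# described in the bullets above. Names, weights, and the 0.5 threshold
# are illustrative.

def extend_term_doc(td, w, threshold=0.5):
    """Set the extended entry to 1 when an absent term is connected, with
    ontology weight >= threshold, to a term that occurs in the document."""
    extended = {t: list(cols) for t, cols in td.items()}
    n_docs = len(next(iter(td.values())))
    for t in td:
        for d in range(n_docs):
            if td[t][d] == 0 and any(
                td[u][d] == 1 and w.get((t, u), w.get((u, t), 0.0)) >= threshold
                for u in td if u != t
            ):
                extended[t][d] = 1
    return extended

# The worked example from the text: one document I
td = {"automobile": [0], "transportation": [1], "city": [0]}
w = {("automobile", "transportation"): 0.7, ("transportation", "city"): 0.3}
print(extend_term_doc(td, w))
# automobile flips to 1 (w = 0.7 >= 0.5); city stays 0 (w = 0.3 < 0.5)
```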


An alternative to the [1,0] representation is the tf-idf (term frequency-inverse document frequency) model. In this case, each term gets a weight given its frequency in individual documents (tf, frequency of the term in each document) and its frequency in all documents (idf, inverse document frequency). There are many ways to create a tf-idf matrix [7]. In FCM, our main focus is the use of fuzzy set theory to find the association between terms and documents. Given such an association, one can represent each entry in the term-document matrix with a set rather than either [0,1] or tf-idf. The use of fuzzy tf-idf is an alternative to the conventional tf-idf. In this case, the original tf-idf weighting values will be replaced by a fuzzy set representing the original crisp value of that specific term. To construct such a value, both an ontology and a similarity measure can be used. To develop the ontology and similarity, one can use conventional Latent Semantic Indexing (LSI) or Fuzzy-LSI [7]. Given this concept (FCM), one can also modify the link analysis (Fig. 3) and, in general, the Webpage-Webpage similarity (Fig. 4). More information about this project, and a Java version of the Fuzzy Search Tool (FST) that uses the FCM model, is available at http://www.cs.berkeley.edu/∼nikraves/fst//SFT and in a series of papers by the author in Nikravesh, Zadeh and Kacprzyk [12]. Currently, we are extending our work to use graph theory to represent the term-document matrix instead of fuzzy sets. While each step increases the complexity and the cost of developing the FCM model, we believe it will increase the performance of the model, given our understanding based on the results we have analyzed so far. Therefore, our target is to develop a more specialized and personalized model with better performance rather than a general, less personalized model with less accuracy. In this case, the cost and complexity will be justified.
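One simple way to realize the fuzzy tf-idf idea, replacing a crisp weight with a fuzzy set of linguistic labels, is fuzzy granulation with triangular membership functions. This is a hedged sketch, not the chapter's actual construction; the labels and breakpoints are illustrative.

```python
# Illustrative granulation of a crisp tf-idf weight into a fuzzy set over
# the linguistic labels low / medium / high, using triangular membership
# functions. The breakpoints (0, 0.5, 1) are arbitrary; a real system
# would fit them to the corpus.

def tri(x, a, b, c):
    """Triangular membership: rises from a, peaks at b, falls to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def granulate(tfidf):
    """Fuzzy set {label: membership} replacing the crisp tf-idf value."""
    return {
        "low": tri(tfidf, -0.001, 0.0, 0.5),
        "medium": tri(tfidf, 0.0, 0.5, 1.0),
        "high": tri(tfidf, 0.5, 1.0, 1.001),
    }

print(granulate(0.7))  # mostly "medium", partly "high"
```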

One of the core ideas that we will use to extend our technique is to transform the terms-documents-concepts (TDC) matrix into rules and a graph. In the following section, we will illustrate how one can build such a model.

Consider a terms-documents-concepts (TDC) matrix presented as in Table 1 (note that the TDC entries k^ii_jj could be crisp numbers, tf-idf values, sets, or fuzzy objects, including the linguistic labels of fuzzy granules, as shown in Fig. 2), where:

Di: documents; i = 1 . . . m (in this example 12)
Keyj: terms/keywords in documents; j = 1 . . . n (in this example 3)
Cij: concepts; ij = 1 . . . l (in this example 2)

One can use the Z(n)-compact algorithm (Sect. 3.1.7) to represent the TDC matrix with a rule-based model. Table 2 shows the intermediate results of the Z(n)-compact algorithm for concept 1. Table 3 shows how the original TDC matrix (Table 1) is represented in the final pass with a maximally compact representation. Once the TDC matrix is represented by a maximally compact representation (Table 3), one can translate this compact representation into rules, as presented in Tables 4 and 5. Table 6 shows the Z(n)-compact algorithm. Z(n)-compact is the basis for creating the web-web similarity shown in Fig. 4.


Table 1 Terms-Documents-Concepts (TDC) Matrix

Documents  key1   key2   key3   Concepts
D1         k^1_1  k^1_2  k^1_3  c1
D2         k^1_1  k^2_2  k^1_3  c1
D3         k^2_1  k^2_2  k^1_3  c1
D4         k^3_1  k^2_2  k^1_3  c1
D5         k^3_1  k^1_2  k^2_3  c1
D6         k^1_1  k^2_2  k^2_3  c1
D7         k^2_1  k^2_2  k^2_3  c1
D8         k^3_1  k^2_2  k^2_3  c1
D9         k^3_1  k^1_2  k^1_3  c2
D10        k^1_1  k^1_2  k^2_3  c2
D11        k^2_1  k^1_2  k^1_3  c2
D12        k^2_1  k^1_2  k^2_3  c2

Table 2 Intermediate results for Z(n)-compact algorithm

Documents               key1   key2   key3   C
D1                      k^1_1  k^1_2  k^1_3  c1
D2                      k^1_1  k^2_2  k^1_3  c1
D3                      k^2_1  k^2_2  k^1_3  c1
D4                      k^3_1  k^2_2  k^1_3  c1
D5                      k^3_1  k^1_2  k^2_3  c1
D6                      k^1_1  k^2_2  k^2_3  c1
D7                      k^2_1  k^2_2  k^2_3  c1
D8                      k^3_1  k^2_2  k^2_3  c1
D2, D3, D4              ∗      k^2_2  k^1_3  c1
D6, D7, D8              ∗      k^2_2  k^2_3  c1
D5, D8                  k^3_1  ∗      k^2_3  c1
D2, D6                  k^1_1  k^2_2  ∗      c1
D3, D7                  k^2_1  k^2_2  ∗      c1
D4, D8                  k^3_1  k^2_2  ∗      c1
D2, D3, D4, D6, D7, D8  ∗      k^2_2  ∗      c1
D1                      k^1_1  k^1_2  k^1_3  c1
D5, D8                  k^3_1  ∗      k^2_3  c1
D2, D3, D4, D6, D7, D8  ∗      k^2_2  ∗      c1

Table 3 Maximally Z(n)-compact representation of TDC matrix

Documents               key1   key2   key3   Concepts
D1                      k^1_1  k^1_2  k^1_3  c1
D5, D8                  k^3_1  ∗      k^2_3  c1
D2, D3, D4, D6, D7, D8  ∗      k^2_2  ∗      c1
D9                      k^3_1  k^1_2  k^1_3  c2
D10                     k^1_1  k^1_2  k^2_3  c2
D11, D12                k^2_1  k^1_2  ∗      c2


Table 4 Rule-based representation of Z(n)-compact of TDC matrix

Documents               Rules
D1                      If key1 is k^1_1 and key2 is k^1_2 and key3 is k^1_3 THEN Concept is c1
D5, D8                  If key1 is k^3_1 and key3 is k^2_3 THEN Concept is c1
D2, D3, D4, D6, D7, D8  If key2 is k^2_2 THEN Concept is c1
D9                      If key1 is k^3_1 and key2 is k^1_2 and key3 is k^1_3 THEN Concept is c2
D10                     If key1 is k^1_1 and key2 is k^1_2 and key3 is k^2_3 THEN Concept is c2
D11, D12                If key1 is k^2_1 and key2 is k^1_2 THEN Concept is c2

Table 5 Rule-based representation of maximally Z(n)-compact of TDC matrix (alternative representation for Table 4)

Documents               Rules
D1                      If key1 is k^1_1 and key2 is k^1_2 and key3 is k^1_3
                        OR
D5, D8                  If key1 is k^3_1 and key3 is k^2_3
                        OR
D2, D3, D4, D6, D7, D8  If key2 is k^2_2 THEN Concept is c1

D9                      If key1 is k^3_1 and key2 is k^1_2 and key3 is k^1_3
                        OR
D10                     If key1 is k^1_1 and key2 is k^1_2 and key3 is k^2_3
                        OR
D11, D12                If key1 is k^2_1 and key2 is k^1_2 THEN Concept is c2

Table 6 Z(n)-Compactification Algorithm

Z(n)-Compact Algorithm: the following steps are performed successively for each column jj; jj = 1 . . . n

1. Starting with k^ii_jj (ii = 1, jj = 1), check whether, for every k^ii_1 (ii = 1, . . . , 3 in this case), all the other columns are the same; if so, k^ii_1 can be replaced by ∗.
   • For example, we can replace k^ii_1 (ii = 1, . . . , 3) with ∗ in rows 2, 3, and 4. One can also replace k^ii_1 with ∗ in rows 6, 7, and 8 (Table 1, first pass).
2. Starting with k^ii_jj (ii = 1, jj = 2), check whether, for every k^ii_2 (ii = 1, . . . , 3 in this case), all the other columns are the same; if so, k^ii_2 can be replaced by ∗.
   • For example, we can replace k^ii_2 (ii = 1, . . . , 3) with ∗ in rows 5 and 8 (Table 1, first pass).
3. Repeat steps 1 and 2 for all jj.
4. Repeat steps 1 through 3 on the newly created ∗ rows (Pass 1 to Pass nn; in this case, Pass 1 to Pass 3).
   • For example, on rows ∗ 2, 3, 4 (Pass 1), check whether any of the rows, given columns jj, can be replaced by ∗. In this case, k^ii_3 can be replaced by ∗. This gives: ∗ k^2_2 ∗.
   • For example, on Pass 3, check whether any of the rows, given columns jj, can be replaced by ∗. In this case, k^ii_1 can be replaced by ∗. This gives: ∗ k^2_2 ∗.
5. Repeat steps 1 through 4 until no further compactification is possible.
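The compaction idea of Table 6 can be sketched in a few lines. This is a simplified, greedy reading of the algorithm (data layout and names are my own): an entry becomes ∗ when rows identical in every other column cover all possible values of that column. Because rows are consumed once merged, it produces a compact non-overlapping cover, which may differ slightly from the overlapping maximal representation of Tables 2 and 3.

```python
# Greedy sketch of Z(n)-compactification over one concept's rows.

def zn_compact(rows, domains):
    """rows: tuples of key-value labels; domains: value set per column."""
    rows = set(rows)
    changed = True
    while changed:
        changed = False
        for j, dom in enumerate(domains):
            groups = {}
            for r in rows:
                groups.setdefault(r[:j] + ("?",) + r[j + 1:], set()).add(r[j])
            for key, vals in groups.items():
                if dom <= vals:  # all values of column j present -> wildcard
                    rows -= {key[:j] + (v,) + key[j + 1:] for v in dom}
                    rows.add(key[:j] + ("*",) + key[j + 1:])
                    changed = True
    return rows

# Concept c1 rows of Table 1 (value labels only, e.g. "k1" for k^1_j)
c1 = [("k1", "k1", "k1"), ("k1", "k2", "k1"), ("k2", "k2", "k1"),
      ("k3", "k2", "k1"), ("k3", "k1", "k2"), ("k1", "k2", "k2"),
      ("k2", "k2", "k2"), ("k3", "k2", "k2")]
doms = [{"k1", "k2", "k3"}, {"k1", "k2"}, {"k1", "k2"}]
print(sorted(zn_compact(c1, doms)))
```

On this input the greedy version yields D1 unchanged, the wildcard row ∗ k^2_2 ∗, and D5 uncompacted, a valid cover of concept c1.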


As has been noted, the TDC entries need not be crisp numbers. The following cases are possible:

A. The basis for the k^ii_jj is [0 and 1]. This is the simplest case, and Z(n)-compact will work as presented.
B. The basis for the k^ii_jj is tf-idf or any similar statistically based values, or GA-GP context-based tf-idf, ranked tf-idf, or fuzzy tf-idf. In this case, we use fuzzy granulation to granulate tf-idf into a series of granules: two ([0 or 1], or [high and low]), three (i.e. low, medium, and high), etc. Then Z(n)-compact will work as presented.
C. The basis for the k^ii_jj is set values created based on an ontology, which can be built with traditional statistical methods, by hand, or as a fuzzy ontology. In this case, the first step is to find the similarities between set values using statistical or fuzzy similarity measures. The BISC-DSS software has a set of similarity measures, T-norm and T-conorm operators, and aggregation operators for this purpose. The second step is to use fuzzy granulation to granulate the similarity values into a series of granules: two ([0 or 1], or [high and low]), three (i.e. low, medium, and high), etc. Then Z(n)-compact will work as presented.
D. It is also important to note that concepts may not be crisp either. Therefore, steps B and C can also be used to granulate concepts, just as they are used to granulate the keyword entry values (k^ii_jj).
E. Another important case is how to select the right keywords in the first place. One can use traditional statistical or probabilistic techniques based on tf-idf, non-traditional techniques such as GA-GP context-based tf-idf, ranked tf-idf or fuzzy tf-idf, or clustering techniques. These techniques are used as a first pass to select the initial set of keywords. The second step is a feature-selection technique based on maximally separating the concepts. These techniques are currently part of the BISC-DSS toolbox, which includes the Z(n)-Compact-Feature-Selection technique (Z(n)-FCS).
F. Other possible cases: these include keywords represented based on a set of concepts (such as the Chinese-DNA model) or concepts presented as a set of keywords (traditional techniques). In the following sections, we will use the Neu-FCS model to create concepts automatically and relate the keywords to the concepts through a mesh of networks of neurons.
G. Another very important case is when the cases are not possibilistic and are in the general form "IF Keyi isri Ki and . . . THEN Class isrc Cj", where isr can be presented as:

r: =      equality constraint: X = R is an abbreviation of X is = R
r: ≤      inequality constraint: X ≤ R
r: ⊂      subsethood constraint: X ⊂ R
r: blank  possibilistic constraint; X is R; R is the possibility distribution of X
r: v      veristic constraint; X isv R; R is the verity distribution of X
r: p      probabilistic constraint; X isp R; R is the probability distribution of X
r: rs     random set constraint; X isrs R; R is the set-valued probability distribution of X
r: fg     fuzzy graph constraint; X isfg R; X is a function and R is its fuzzy graph


r: u      usuality constraint; X isu R means usually (X is R)
r: g      group constraint; X isg R means that R constrains the attribute-values of the group

• Primary constraints: possibilistic, probabilistic, and veristic
• Standard constraints: bivalent possibilistic, probabilistic, and bivalent veristic

H. It is important to note that keys can also be presented in the form of grammar and linguistics, such as subjects 1 to n, verbs 1 to m, objects 1 to l, etc. In this case, each sentence can be presented in "isr" form or as "IF . . . THEN" rules with the "isr" or "is" format.

Once the TDC matrix has been transformed into a maximally compact concept based on graph representation and rules based on possibilistic relational universal fuzzy–type I, II, III, and IV (pertaining to modification, composition, quantification, and qualification), one can use the Z(n)-compact algorithm and transform the TDC into a decision tree and hierarchical graph that represents a Q&A model. Finally, the concepts of semantic equivalence and semantic entailment based on possibilistic relational universal fuzzy will be used as a basis for question-answering (Q&A) and inference from fuzzy premises. This will provide a foundation for approximate reasoning, a language for the representation of imprecise knowledge, a meaning-representation language for natural languages, precisiation of fuzzy propositions expressed in a natural language, and a tool for Precisiated Natural Language (PNL) and precisiation of meaning. The maximally compact documents based on the Z(n)-compact algorithm and possibilistic relational universal fuzzy–type II will be used to cluster the documents based on concept-based, query-based search criteria. Table 7 shows the technique based on possibilistic relational universal fuzzy–type I, II, III, and IV (pertaining to modification, composition, quantification, and qualification) [3, 4] and a series of examples to clarify the techniques. Types I through IV of possibilistic relational universal fuzzy in connection with Z(n)-compact can be used as a basis for precisiation of meaning. Tables 8 through 17 show the technique

Table 7 Four Types of PRUF: Type I, Type II, Type III and Type IV

PRUF Type I: pertaining to modification
  X is very small
  X is much larger than Y
  Eleanor was very upset
  The man with the blond hair is very tall

PRUF Type III: pertaining to quantification
  Most Swedes are tall
  Many men are much taller than most men
  Most tall men are very intelligent

PRUF Type II: pertaining to composition
  X is small and Y is large (conjunctive composition)
  X is small or Y is large (disjunctive composition)
  If X is small then Y is large (conditional and conjunctive composition)
  If X is small then Y is large else Y is very large (conditional and conjunctive composition)

PRUF Type IV: pertaining to qualification
  • Abe is young is not very true (truth qualification)
  • Abe is young is quite probable (probability qualification)
  • Abe is young is almost impossible (possibility qualification)


Table 8 Possibilistic Relational Universal Fuzzy–Type I; pertaining to modification, rules of type I, basis is the modifier

Proposition p: N is F
Modified proposition p+: p+ = N is mF
m: not, very, more or less, quite, extremely, etc.

μYOUNG = 1 − S(25, 35, 45)
Lisa is very young → ΠAge(Lisa) = YOUNG²
μYOUNG² = (1 − S(25, 35, 45))²

p: Vera and Pat are close friends
Approximate p → p∗: Vera and Pat are friends²
π(FRIENDS) = μFRIENDS²(Name1 = Vera; Name2 = Pat)
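The entries of Table 8 can be computed directly. The sketch below assumes Zadeh's standard S-function (0 below a, 1 above c, crossover 0.5 at b) and the usual reading of "very" as squaring the membership; it is an illustration, not code from the book.

```python
# Computes mu_YOUNG = 1 - S(25, 35, 45) and the modifier "very" as
# squaring, as in Table 8. Assumes Zadeh's standard S-function.

def s_function(u, a, b, c):
    """Zadeh's S-function: 0 below a, 1 above c, crossover 0.5 at b."""
    if u <= a:
        return 0.0
    if u >= c:
        return 1.0
    if u <= b:
        return 2 * ((u - a) / (c - a)) ** 2
    return 1 - 2 * ((u - c) / (c - a)) ** 2

def mu_young(age):
    return 1 - s_function(age, 25, 35, 45)

def mu_very_young(age):
    # "very" intensifies a fuzzy predicate by squaring its membership
    return mu_young(age) ** 2

print(mu_young(35), mu_very_young(35))  # 0.5 0.25
```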

Table 9 Possibilistic Relational Universal Fuzzy–Type II – operation of composition – to be used to compact the rules presented in Table 5

R:  X1   X2   . . .  Xn
    F11  F12  . . .  F1n
    . . .
    Fm1  Fm2  . . .  Fmn

R = X1 is F11 and X2 is F12 and . . . Xn is F1n OR
    X1 is F21 and X2 is F22 and . . . Xn is F2n OR
    . . .
    X1 is Fm1 and X2 is Fm2 and . . . Xn is Fmn

R → (F11 × · · · × F1n) + · · · + (Fm1 × · · · × Fmn)

Table 10 Possibilistic Relational Universal Fuzzy–Type II – pertaining to composition – main operators

p = q ∗ r, with q: M is F and r: N is G

M is F and N is G: F ∩ G = F × G
M is F or N is G: F + G
If M is F then N is G: F′ ⊕ G, or equivalently F × G + F′ × V
If M is F then N is G else N is H:
  Π(X1,...,Xm,Y1,...,Yn) = (F′ ⊕ G) ∩ (F ⊕ H)
  ↔ ((If M is F then N is G) and (If M is not F then N is H))

where F′ is extended cylindrically as F′ × V, and G as U × G
μF×G(u, v) = μF(u) ∧ μG(v)
μF⊕G(u, v) = 1 ∧ (1 − μF(u) + μG(v))
∧: min; +: arithmetic sum; −: arithmetic difference


Table 11 Possibilistic Relational Universal Fuzzy–Type II – Examples

U = V = 1 + 2 + 3
M: X, N: Y
F: SMALL = 1/1 + 0.6/2 + 0.1/3
G: LARGE = 0.1/1 + 0.6/2 + 1/3

X is small or Y is large:
Π(x,y) = [1 1 1; 0.6 0.6 1; 0.1 0.6 1]

X is small and Y is large:
Π(x,y) = [0.1 0.6 1; 0.1 0.6 0.6; 0.1 0.1 0.1]

if X is small then Y is large (F′ ⊕ G):
Π(x,y) = [0.1 0.6 1; 0.5 1 1; 1 1 1]

if X is small then Y is large (F × G + F′ × V):
Π(x,y) = [0.1 0.6 1; 0.4 0.6 0.6; 0.9 0.9 0.9]
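The Type II matrices of Table 11 can be reproduced by composing the SMALL and LARGE membership vectors pointwise: disjunction as max, conjunction as min, and the conditional as the bounded sum 1 ∧ (1 − μF(u) + μG(v)) from Table 10. A small sketch, with the table's own membership values:

```python
# Reconstructs the Type II composition matrices of Table 11.

SMALL = [1.0, 0.6, 0.1]   # membership of u = 1, 2, 3
LARGE = [0.1, 0.6, 1.0]   # membership of v = 1, 2, 3

def compose(f, g, op):
    """Pointwise composition matrix over U x V, rounded for readability."""
    return [[round(op(fu, gv), 2) for gv in g] for fu in f]

or_pi = compose(SMALL, LARGE, max)                          # disjunctive
and_pi = compose(SMALL, LARGE, min)                         # conjunctive
ifthen_pi = compose(SMALL, LARGE,
                    lambda fu, gv: min(1.0, 1 - fu + gv))   # conditional

for row in ifthen_pi:   # matches the conditional matrix of Table 11
    print(row)
```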

Table 12 Possibilistic Relational Universal Fuzzy–Type III, pertaining to quantification, rules of type III; p: Q N are F

p = Q N are F
Q: fuzzy quantifier (most, many, few, some, almost)
Most Swedes are tall
N is F → ΠX = F
Q N are F → Πcount(F) = Q
m p ↔ q, with
  (Πp)′ if m: not
  (Πp)² if m: very
  (Πp)^0.5 if m: more or less

Table 13 Possibilistic Relational Universal Fuzzy–Type III, pertaining to quantification, rules of type III; p: Q N are F, Examples

m (N is F) ↔ N is mF
not (N is F) ↔ N is not F
very (N is F) ↔ N is very F
more or less (N is F) ↔ N is more or less F

m (M is F and N is G) ↔ (X, Y) is m (F × G)
not (M is F and N is G) ↔ (X, Y) is (F × G)′ ↔ (M is not F) or (N is not G)
very (M is F and N is G) ↔ (M is very F) and (N is very G)
more or less (M is F and N is G) ↔ (M is more or less F) and (N is more or less G)

m (Q N are F) ↔ (m Q) N are F
not (Q N are F) ↔ (not Q) N are F
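A quantified Type III proposition such as "Most Swedes are tall" can be evaluated by applying the quantifier's membership function to the relative sigma-count of F. The sketch below is a hedged illustration: the shape of μMOST and the data are my own assumptions, not taken from the chapter.

```python
# Evaluates truth(Q N are F) = mu_Q(sigma-count(F) / n) for Q = "most".

def s_function(u, a, b, c):
    """Zadeh's S-function: 0 below a, 1 above c, crossover 0.5 at b."""
    if u <= a:
        return 0.0
    if u >= c:
        return 1.0
    if u <= b:
        return 2 * ((u - a) / (c - a)) ** 2
    return 1 - 2 * ((u - c) / (c - a)) ** 2

def mu_most(r):
    # "most" ramps up between proportions 0.3 and 0.9 (assumed shape)
    return s_function(r, 0.3, 0.6, 0.9)

def truth_q_n_are_f(memberships, mu_q):
    """truth(Q N are F) = mu_Q applied to the relative sigma-count."""
    return mu_q(sum(memberships) / len(memberships))

# Degrees to which five individuals are tall (invented data)
tall = [1.0, 0.8, 0.9, 0.4, 1.0]
print(truth_q_n_are_f(tall, mu_most))  # a high truth value for "most are tall"
```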


Table 14 Possibilistic Relational Universal Fuzzy–Type IV, pertaining to qualification, rules of type IV; q: p is ?, Example 1

y: a truth value, a probability value, a possibility value, . . .
For p: N is F, let q be a truth-qualified version of p:
q: N is F is T, where T is a linguistic truth value

N is F is T ↔ N is G, with T = μF(G) and μG(u) = μT(μF(u))

Example: q: N is small is very true
μSMALL = 1 − S(5, 10, 15)
μTRUE = S(0.6, 0.8, 1.0)
N is F → ΠX = F; then N is F is T → ΠX = F+, with μF+(u) = μT(μF(u))
q → ΠX(u) = S²(1 − S(u; 5, 10, 15); 0.6, 0.8, 1.0)

If T = u-true: μu-true(v) = v, v ∈ [0, 1], and N is F is u-true → N is F

Table 15 Possibilistic Relational Universal Fuzzy–Type IV, pertaining to qualification, rules of type IV; q: p is ?, Example 2

y: a truth value, a probability value, a possibility value, . . .
For p: N is F, let q be a truth-qualified version of p:
q: N is F is T, where T is a linguistic truth value

N is F is T ↔ N is G, with T = μF(G) and μG(u) = μT(μF(u))

q: N is small is very true
μSMALL = 1 − S(5, 10, 15)
μTRUE = S(0.6, 0.8, 1.0)
N is F → ΠX = F; then N is F is T → ΠX = F+, with μF+(u) = μT(μF(u))
q → ΠX(u) = S²(1 − S(u; 5, 10, 15); 0.6, 0.8, 1.0)

If T = u-true: μu-true(v) = v, v ∈ [0, 1], and N is F is u-true → N is F

Table 16 Possibilistic Relational Universal Fuzzy–Type IV, pertaining to qualification, rules of type IV; q: p is ?, Example 3

m (N is F is T) ↔ N is F is mT
not (N is F is T) ↔ N is F is not T
very (N is F is T) ↔ N is F is very T
more or less (N is F is T) ↔ N is F is more or less T
N is not F is T ↔ N is F is ant T (ant: antonym; false = ant true)
N is very F is T ↔ N is F is 0.5T
μ0.5T(v) = μT(v²); in general, μaT(v) = μT(v^(1/a))


Table 17 Possibilistic Relational Universal Fuzzy–Type IV, pertaining to qualification, rules of type IV; q: p is ?, Example 4

"Barbara is not very rich": semantically equivalent propositions:
  Barbara is not very rich
  Barbara is not very rich is u-true
  Barbara is very rich is ant u-true
  Barbara is rich is 0.5(ant u-true)
where μ0.5(ant u-true)(v) = 1 − v²
Since true is approximately semantically equal to u-true, the proposition is approximated as:
  "Barbara is rich is not very true"
  "Is Barbara rich?" is "not very true".

based on possibilistic relational universal fuzzy–type I, II, III, and IV (pertaining to modification, composition, quantification, and qualification) [11, 12, 13], together with a series of examples to clarify the definitions.

2.1 Fuzzy Conceptual Match

The main problems with conventional information retrieval and search, such as the vector space representation of term-document vectors, are that 1) there is no real theoretical basis for the assumption of a term and document space and 2) terms and documents are not really orthogonal dimensions. These techniques are used more for visualization, and most similarity measures work about the same regardless of the model. In addition, terms are not independent of all other terms. With regard to probabilistic models, important indicators of relevance may not be terms, though terms alone are usually used. Regarding the Boolean model, complex query syntax is often misunderstood, and the problems of null output and information overload exist. One solution to these problems is to use an extended Boolean model or fuzzy logic. In this case, one can add a fuzzy quantifier to each term or concept. In addition, one can interpret AND as a fuzzy-MIN function and OR as a fuzzy-MAX function. Alternatively, one can add agents to the user interface and assign certain tasks to them, or use machine learning to learn user behavior or preferences to improve performance. This technique is useful when past behavior is a useful predictor of the future and a wide variety of behaviors exists among users. In our perspective, we define this framework as Fuzzy Conceptual Matching based on the Human Mental Model. The Conceptual Fuzzy Search (CFS) model will be used for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as "Concept"). The selected query does not need to match the decision criteria exactly, which gives the system a more human-like behavior. The CFS can also be used for constructing a fuzzy ontology, or terms related to the context of the search or query, to resolve ambiguity. It is intended to combine expert knowledge with soft computing tools. Expert knowledge needs to be partially converted into artificial intelligence that can better handle the huge information stream. In addition, sophisticated management work-flows need to be designed to make optimal use of this information. The new model can execute conceptual matching dealing with context-dependent word ambiguity and produce results in a format that permits the user to interact dynamically to customize and personalize its search strategy.
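The extended-Boolean interpretation (AND as fuzzy-MIN, OR as fuzzy-MAX) can be sketched in a few lines. The documents and membership scores below are invented for illustration.

```python
# Minimal extended-Boolean retrieval: AND as fuzzy-MIN, OR as fuzzy-MAX
# over per-document term memberships. Data is invented.

docs = {
    "d1": {"java": 0.9, "coffee": 0.1, "computer": 0.8},
    "d2": {"java": 0.7, "coffee": 0.8, "computer": 0.0},
}

def fuzzy_and(doc, terms):
    """Degree to which the document satisfies all query terms."""
    return min(doc.get(t, 0.0) for t in terms)

def fuzzy_or(doc, terms):
    """Degree to which the document satisfies any query term."""
    return max(doc.get(t, 0.0) for t in terms)

# Query "java AND computer": d1 -> min(0.9, 0.8), d2 -> min(0.7, 0.0)
scores = {name: fuzzy_and(terms, ["java", "computer"])
          for name, terms in docs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```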

2.2 From Search Engine to Q/A and Questionnaire Systems

(Extracted text from Prof. Zadeh's presentation and abstracts; Nikravesh et al., Web Intelligence: Conceptual-Based Model, Memorandum No. UCB/ERL M03/19, 5 June 2003): "Construction of Q/A systems has a long history in AI. Interest in Q/A systems peaked in the seventies and eighties, and began to decline when it became obvious that the available tools were not adequate for construction of systems having significant question-answering capabilities. However, Q/A systems in the form of domain-restricted expert systems have proved to be of value, and are growing in versatility, visibility and importance. Upgrading a search engine to a Q/A system is a complex, effort-intensive, open-ended problem. Semantic Web and related systems for upgrading quality of search may be viewed as steps in this direction. But what may be argued, as is done in the following, is that existing tools, based as they are on bivalent logic and probability theory, have intrinsic limitations. The principal obstacle is the nature of world knowledge. Dealing with world knowledge needs new tools. A new tool which is suggested for this purpose is the fuzzy-logic-based method of computing with words and perceptions (CWP [1, 2, 3, 4]), with the understanding that perceptions are described in a natural language [5, 6, 7, 8, 9]. A concept which plays a key role in CWP is that of Precisiated Natural Language (PNL). It is this language that is the centerpiece of our approach to reasoning and decision-making with world knowledge. The main thrust of the fuzzy-logic-based approach to question-answering which is outlined here is that to achieve significant question-answering capability it is necessary to develop methods of dealing with the reality that much of world knowledge, and especially knowledge about underlying probabilities, is perception-based. Dealing with perception-based information is more complex and more effort-intensive than dealing with measurement-based information. In this instance, as in many others, complexity is the price that has to be paid to achieve superior performance."

Once the TDC matrix (Table 1) has been transformed into the maximally Z(n)-compact representation (Table 3) and rules (Tables 4 and 5), one can use the Z(n)-compact algorithm to transform the TDC into a decision-tree/hierarchical graph that represents a Q&A model, as shown in Fig. 5.

2.3 NeuSearchTM

There are two types of search engines that we are interested in and that dominate the Internet. First, the most popular search engines, which are mainly for unstructured


[Figure: decision tree with question nodes Q, Q1, Q2, Q3, answer branches a^i_j, and leaves labeled with concepts c1/c2 and the Z(n)-compact clusters D1; D5, D8; D2, D3, D4, D6, D7, D8; D9; D10; D11, D12; a Q & A and deduction model with Z(n)-compact clustering]

Fig. 5 Q & A model of TDC matrix and maximally Z(n)-compact rules and concept-based query-based clustering

data, such as GoogleTM, Yahoo, MSN, and Teoma, which are based on the concept of authorities and hubs. Second, search engines that are task specific, such as 1) Yahoo!: manually pre-classified; 2) NorthernLight: classification; 3) Vivisimo: clustering; 4) Self-Organizing Map: clustering + visualization; and 5) AskJeeves: natural-language-based search with human experts. Google uses PageRank and Teoma uses HITS for ranking. To develop such models, state-of-the-art computational intelligence techniques are needed [10, 11, 12, 13, 14, 15]. Figures 5 through 8 show how neuroscience and PNL can be used to develop the next generation of search engines.

Figure 6 shows a unified framework for the development of a search engine based on conceptual semantic indexing. This model will be used to develop the NeuSearch model based on the neuroscience and PNL approach. As explained previously and represented by Figs. 2 through 4 with respect to the development of FCM, the first step will be to represent the term-document matrix. Tf-idf is the starting point for our model. As explained earlier, there are several ways to create the tf-idf matrix [7]. One attractive idea is to use a term-ranking algorithm based on evolutionary computing, as presented in Fig. 6: GA-GP context-based tf-idf or ranked tf-idf.

Once the fuzzy tf-idf (term-document) matrix is created, the next step will be to use this indexing to develop the search and information retrieval mechanism. As shown in Fig. 6, there are two alternatives: 1) classical search models such as the LSI-based approaches, and 2) the NeuSearch model. The LSI-based models include probability-based LSI, Bayesian LSI, Fuzzy LSI, and NNnet-based LSI models. It is interesting that one can find an alternative to each of these LSI-based models using Radial Basis Functions (RBF). For example, one can use a probabilistic RBF equivalent to


[Figure: the term-document matrix ([0, 1], tf-idf, set entries; bivalent-logic, statistical-probabilistic, fuzzy set, and fuzzy set-object-based theories; specialization; GA-GP context-based tf-idf, ranked tf-idf) feeding two paths: classical search (keyword search and classical techniques: Google, Teoma, Lycos, etc.; imprecise search; concept-based indexing; topic, title, summarization; NLP-based search, possibly NLP with GA-GP, using graph theory and semantic nets: AskJeeves) and the neuroscience approach based on FCS, i.e. Neuro-Fuzzy Conceptual Search (NeuFCS), with LSI variants (probability, Bayesian, fuzzy, NNnet (BP, GA-GP, SVM)) mapped to RBF variants (PRBF, GRBF, ANFIS, RBFNN (BP, GA-GP, SVM)) and connection weights w(i, j)]

Fig. 6 Search engine based on conceptual semantic indexing

[Figure (Neu-FCS): a word space mapped into a concept-context-dependent word space through a documents space (or concept and context space) based on SOM or PCA over the document corpus; i: neuron in the word layer, j: neuron in the document or concept-context layer, k: neuron in the word layer; w(i, j) = f(p(i, j), p(i), p(j)) and w(j, k) = f(p(j, k), p(j), p(k)); w(i, j) is calculated based on Fuzzy-LSI or probabilistic LSI (in general form, it can be calculated based on PNL)]

Fig. 7 Neuro-Fuzzy Conceptual Search (NeuFCS)

Concept-Based Search and Questionnaire Systems 211

[Figure 8 residue. Interconnections are based on mutual information: one panel links a document layer (neurons i) to a word layer (neurons j); the other maps the word space (neurons i) to the concept-context dependent word space through the document or concept-context layer (neurons j) and back to the word layer (neurons k) over the document corpus. Weights are w(i, j) = f(p(i, j), p(i), p(j)) and w(j, k) = f(p(j, k), p(j), p(k)); links also carry relation types r(i,j): i is a subset of j (is or isu), i is a superset of j (is or isu), i is an instance of j (is or isu), j is an attribute of i, i causes j (or usually), i and j are related.]

Fig. 8 PNL-based conceptual fuzzy search using brain science

Probability-based-LSI, Generalized RBF as an equivalent to Bayesian-LSI, ANFIS as an equivalent to Fuzzy-LSI, and an RBF neural network (RBFNN) as an equivalent to NNnet-LSI. The RBF model is the basis of the NeuSearch model (Neuro-Fuzzy Conceptual Search, NeuFCS). Given the NeuSearch model, one needs to calculate the network weights w(i,j), which will be defined in the next section. Depending on the model and framework used (Probability, Bayesian, Fuzzy, or NNnet), the interpretation of the weights will differ. Fig. 7 shows the typical input-output of the Neuro-Fuzzy Conceptual Search (NeuFCS) model. The main idea of NeuSearch is concept-context-based search. For example, one can search for the word "Java" in the context of "Computer" or in the context of "Coffee". One can also search for "apple" in the context of "Fruit" or in the context of "Computer". Therefore, before we relate terms to documents, we first extend the keyword given the existing concepts-contexts, and then relate the extended terms to the documents. There is thus always a two-step process, based on the NeuSearch model, as shown below:

Original keywords → Concept-Context nodes (RBF nodes) → Extended keywords (weights W(i,j), W(j,k))

Original documents → Concept-Context nodes (RBF nodes) → Extended documents (weights W'(i,j), W'(j,k))

In general, w(i,j) is a function of p(i,j), p(i), and p(j), where p denotes probability. Therefore, one can use probabilistic-LSI or PRBF given the NeuSearch
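As an illustration of such a weight, one common concrete choice for f(p(i,j), p(i), p(j)) is pointwise mutual information. The sketch below computes it from a raw term-document count matrix; the function name and the PMI choice are illustrative assumptions, not the specific f used in this chapter.

```python
import math

def pmi_weights(counts):
    """w(i, j) = log(p(i, j) / (p(i) * p(j))): pointwise mutual information
    computed from a raw term-document count matrix (rows: terms, cols: docs)."""
    total = sum(sum(row) for row in counts)
    p_i = [sum(row) / total for row in counts]            # term marginals
    p_j = [sum(col) / total for col in zip(*counts)]      # document marginals
    w = [[0.0] * len(counts[0]) for _ in counts]
    for i, row in enumerate(counts):
        for j, c in enumerate(row):
            if c > 0:
                w[i][j] = math.log((c / total) / (p_i[i] * p_j[j]))
    return w
```

A term concentrated in one document then gets a positive weight there, while terms spread evenly over all documents get weights near zero.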

[Figure 9 residue. The figure sketches the epistemic lexicon as a network of nodes (lexines) and links: a lexine i is an object, construct or concept (e.g., car, Ph.D.); K(i) is the world knowledge about i (mostly perception-based), organized into n(i) relations R_i1, ..., R_in; entries in R_ij are bimodal-distribution-valued attributes of i whose values are, in general, granular and context-dependent; w_ij is the granular strength of association between lexines i and j, and r_ij is the relation type: i is an instance of j (is or isu), i is a subset of j (is or isu), i is a superset of j (is or isu), j is an attribute of i, i causes j (or usually), i and j are related.]

Fig. 9 Organization of the world knowledge; Epistemic (knowledge-directed) and Lexicon (ontology-related)

framework. If the probabilities are not known, which is often the case, one can use the Fuzzy-LSI, ANFIS, NNnet-LSI, or RBFNN model within the NeuSearch model. In general, the PNL model can be used as a unified framework, as shown in Figs. 8 through 10, which present the PNL-based conceptual fuzzy search using brain science model and the concepts introduced in Figs. 2 through 7.

• standard constraint: X ∈ C
• generalized constraint: X isr R, where X is the constrained variable, R the constraining relation, r the type identifier, and isr the copula (GC-form: generalized constraint form of type r)
• X = (X1, ..., Xn)
• X may have a structure: X = Location(Residence(Carol))
• X may be a function of another variable: X = f(Y)
• X may be conditioned: (X/Y)
• r := blank / v / p / u / rs / fg / ps / ...
• r: rs — random set constraint; X isrs R; R is the set-valued probability distribution of X
• r: fg — fuzzy graph constraint; X isfg R; X is a function and R is its fuzzy graph
• r: u — usuality constraint; X isu R means usually (X is R)
• r: ps — Pawlak set constraint; X isps (X*, X*) means that X is a set and X* and X* are the lower and upper approximations to X

Fig. 10 Generalized Constraint

Based on the PNL approach, w(i,j) is defined in terms of r(i,j) as follows:

[i] —— w(i,j), r(i,j) ——> [j]

where w(i,j) is the granular strength of association between i and j, r(i,j) is an entry of the epistemic lexicon, w(i,j) <== r(i,j), and r(i,j) is defined as follows:

r(i,j): i is an instance of j (is or isu)
        i is a subset of j (is or isu)
        i is a superset of j (is or isu)
        j is an attribute of i
        i causes j (or usually)
        i and j are related

Figure 11 shows a typical example of a Neu-FCS result. Both the term-document matrix and the term-term/document-document matrices are reconstructed. For example, the original document corpus is very sparse, as one can see in the figure; after processing through the concept nodes, however, the document corpus is less sparse and smoother. The original keywords are also expanded given the concept-context nodes. While the nodes in the hidden space (concept-context nodes) are based on RBF (the NeuSearch model) and are one-dimensional, one can use SOM to

[Figure 11 residue. Input: a word from the word space; output: a concept-context dependent word; the Neu-FCS network activates document or concept-context nodes between the word space and the document corpus.]

Fig. 11 Typical example of Neu-FCS

rearrange the nodes given the similarity of the concepts (in this case, the RBF in each node) and present the 1-D nodes as a 2-D model, as shown in Fig. 11.
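The smoothing effect of the concept layer can be illustrated with a minimal two-step propagation: terms activate concept nodes, and the concept activations are projected back onto documents, so a term acquires weight in documents where other terms of its concept occur. This toy pure-Python sketch, with a hand-made term-concept membership table, stands in for the RBF hidden layer of NeuSearch.

```python
def smooth_through_concepts(td, term_concept):
    """Propagate a sparse term-document matrix through concept nodes.
    td[i][j] is the weight of term i in document j; term_concept[i][c]
    is the membership of term i in concept c.  Terms first activate
    concept nodes, then the activations are projected back onto terms,
    so a term gains weight in documents shared by its concept."""
    n_terms, n_docs, n_conc = len(td), len(td[0]), len(term_concept[0])
    cd = [[sum(term_concept[i][c] * td[i][j] for i in range(n_terms))
           for j in range(n_docs)] for c in range(n_conc)]   # concept-document layer
    return [[sum(term_concept[i][c] * cd[c][j] for c in range(n_conc))
             for j in range(n_docs)] for i in range(n_terms)]

# "java" occurs only in document 0, "programming" in both documents;
# both belong to one hypothetical concept node ("computer").
td = [[1.0, 0.0],
      [1.0, 1.0]]
term_concept = [[1.0],
                [1.0]]
extended = smooth_through_concepts(td, term_concept)
# "java" now also receives weight in document 1 through the concept node.
```

The zero entry for "java" in document 1 becomes positive after the round trip, which is exactly the densification effect described above.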

3 Conclusions

Intelligent search engines of growing complexity are currently being developed amid considerable technological challenges. This requires new technology in terms of understanding, development, engineering design and visualization. While the technological expertise behind each component becomes increasingly complex, there is a need for better integration of the components into a global model that adequately captures imprecision and deduction capabilities. In addition, intelligent models can mine the Internet to conceptually match and rank homepages based on predefined linguistic formulations and rules defined by experts, or based on a set of known homepages. The FCM model can be used as a framework for intelligent information and knowledge retrieval through conceptual matching of both text and images (here defined as "Concept"). The FCM can also be used to construct a fuzzy ontology of terms related to the context of the query and search, to resolve ambiguity. This model can be used to calculate conceptually the degree of match to the object or query.

Acknowledgments Funding for this research was provided by British Telecommunication (BT), the BISC Program of UC Berkeley, and the Imaging and Informatics group at Lawrence Berkeley National Lab. The author would like to thank Prof. Zadeh for his feedback and comments, and for allowing the author to use his published and unpublished documents, papers, and presentations to prepare this paper.


Towards Perception Based Time SeriesData Mining

Ildar Z. Batyrshin and Leonid Sheremetov

Abstract Human decision-making procedures in problems related to the analysis of time series data bases (TSDB) often use perceptions like "several days", "high price", "quickly increasing", etc. Computing with Words and Perceptions can be used to formalize perception-based expert knowledge and inference mechanisms defined on the numerical domains of a TSDB. To extract from a TSDB perception-based information relevant to decision-making problems, it is necessary to develop methods of perception-based time series data mining (PTSDM). The paper considers different approaches used in the analysis of time series databases for the description of perception-based patterns and discusses some methods of PTSDM.

1 Introduction

Human decision-making procedures in problems related to the analysis of time series data bases (TSDB) in economics, finance, meteorology, medicine, geophysics, etc. often use perceptions like several days, near future, high price, very quickly increasing, slowly oscillating, new companies, highly associated behavior, very probably, etc., defined on a time domain, on a range of time series values, on sets of time series, attributes and system elements, or on a set of possibility or probability values. Computing with Words and Perceptions (CWP) [38], containing fuzzy logic as its main constituent, can serve as a bridge between linguistic information in expert knowledge and numerical information given in a TSDB. Inference procedures of CWP can be used for modeling human perception-based reasoning mechanisms. To extract from a TSDB perception-based information relevant to decision-making problems, it is necessary to develop methods of perception-based time series data mining (PTSDM). Fig. 1 shows a possible architecture of an intelligent decision-making system integrating PTSDM, CWP and expert knowledge used in decision-making procedures in problems related to the analysis of a TSDB.

Ildar Z. Batyrshin · Leonid SheremetovResearch Program in Applied Mathematics and Computing (PIMAyC), Mexican Petroleum Insti-tute, Eje Central Lazaro Cardenas 152, Col. San Bartolo Atepehuacan, C.P. 07730, Mexico, D.F.,Mexico, e-mail: {batyr,[email protected]}

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. 217 © Springer 2007

218 I. Z. Batyrshin, L. Sheremetov

Fig. 1 Architecture of intelligent decision making system based on expert knowledge in time series data base domains

[Figure 1 residue. The Time Series Data Base feeds an Intelligent Decision Making System composed of Perception Based Time Series Data Mining, Computing with Words and Perceptions, and Expert Knowledge.]

Many economic, financial, technological and natural systems described by TSDB are very complex, and often it is impossible to take into account all information influencing the solution of the decision-making problems related to their analysis. To such systems Zadeh's Principle of Incompatibility can be applied [37]: "As the complexity of a system increases, our ability to make precise and yet significant statements about its behavior diminishes until a threshold is reached beyond which precision and significance (or relevance) become almost mutually exclusive characteristics". For this reason, often only qualitative, perception-based solutions make sense in decision making in complex systems.

Realization of an intelligent system supporting perception-based decision-making procedures in problems related to the analysis of time series data bases requires extending the methods of time series data mining (TSDM) to give them the ability to operate with perceptions. The goal of data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data in novel ways that are both understandable and useful to the data owner [18]. The following list contains the main time series data mining tasks [23, 1, 3, 12, 13, 16, 19, 27, 28, 36].

Segmentation: Split a time series into a number of "meaningful" segments. Possible representations of segments include approximating line, perceptual pattern, word, etc.

Clustering: Find natural groupings of time series or time series patterns.

Classification: Assign given time series or time series patterns to one of several predefined classes.

Indexing: Realize efficient execution of queries.

Summarization: Give a short description of a time series (or multivariate time series) which retains its essential features in the considered problem.

Anomaly Detection: Find surprising, unexpected patterns.

Motif Discovery: Find frequently occurring patterns.

Forecasting: Forecast time series values based on time series history or human expertise.

Discovery of association rules: Find rules relating patterns in time series (e.g. patterns that occur frequently in the same or in neighboring time segments).
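As a small illustration of the first of these tasks, a sliding-window variant of piecewise-linear segmentation can be sketched as follows; the endpoint-line error criterion and the threshold are illustrative choices, not a specific method from the literature surveyed here.

```python
def segment(series, max_error=0.5):
    """Sliding-window piecewise-linear segmentation: grow a segment while the
    straight line through its endpoints stays within max_error of every point;
    return the list of (start, end) index pairs."""
    segments, start = [], 0
    for end in range(2, len(series) + 1):
        i, j = start, end - 1
        slope = (series[j] - series[i]) / (j - i)
        err = max(abs(series[k] - (series[i] + slope * (k - i)))
                  for k in range(i, j + 1))
        if err > max_error:                 # line no longer fits: cut here
            segments.append((start, end - 2))
            start = end - 2
    segments.append((start, len(series) - 1))
    return segments
```

A rise followed by a plateau, for instance, is cut into one "increasing" and one "constant" segment, which can then be labeled with perceptual patterns or words.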

Towards Perception Based Time Series Data Mining 219

These tasks are mutually related; for example, segmentation can be used for indexing, clustering, summarization, etc.

Perception-based time series data mining systems should be able to manipulate linguistic information, fuzzy concepts and perception-based patterns of time series to support human decision making in problems related to time series data bases. Fortunately, a number of methods for manipulating such information have recently been developed in time series data mining. The types of perceptions which can be defined on the domains of a TSDB and a short survey of papers operating with perception-based shape patterns are considered in Sect. 2. The patterns used in many works are usually crisp, but they may be generalized to represent fuzzy patterns. In Sect. 3 we consider new methods of parameterization of perception-based shape patterns and the application of fuzzy perception-based functions to modeling qualitative forecasts of new product sales. The conclusions discuss future directions of research in PTSDM.

2 Perception Based Patterns in Time Series Data Bases

Development of methods of PTSDM requires formalizing human perceptions about time, time series values, patterns and shapes, associations between patterns and time series, etc. These perceptions can be represented by words whose meaning is defined on the following domains of time series data bases:

• time domain: time intervals (one-two weeks, several days, end of the day), absolute or relative position on the time scale (in June 2006, near future), periodic or seasonal time intervals (end of the day, several weeks before Christmas);
• range of TS values (high price; very low level of production);
• perception-based function or pattern of TS shape (slowly decreasing, quickly increasing and slightly concave);
• the set of time series, attributes or system elements (stocks of new companies);
• the set of relations between TS, attributes or elements (highly associated);
• the set of possibility or probability values (unlikely, very probable).

Most of such perceptions can be represented as fuzzy sets defined on the corresponding domain [24]. An exact definition of the membership values of the fuzzy sets used in models of CWP is often not very important when the inputs and outputs of the models are words [38], which are usually insensitive to some change in the membership values of the fuzzy sets representing them. This situation differs from Mamdani or Sugeno fuzzy models, where the initial definition of fuzzy sets usually does not play an important role in the construction of the final fuzzy model when a tuning of membership functions is used in the presence of training input-output data [20].
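For instance, a perception such as "several days" can be given a trapezoidal membership function on the time axis; the breakpoint values below are, of course, hypothetical.

```python
def trapezoid(a, b, c, d):
    """Membership function of a trapezoidal fuzzy set with support (a, d)
    and core [b, c]."""
    def mu(x):
        if x <= a or x >= d:
            return 0.0
        if b <= x <= c:
            return 1.0
        return (x - a) / (b - a) if x < b else (d - x) / (d - c)
    return mu

# Hypothetical meaning of "several days", on a time axis measured in days.
several_days = trapezoid(2, 3, 7, 10)
```

Five days is fully "several days", two or ten days is not at all, and the boundaries are gradual, which is precisely the insensitivity to exact membership values mentioned above.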

Different techniques used for the description of time series shape patterns can be conventionally classified as follows:

• analysis of signs of the first and second derivatives of the time series variable,
• scaling of trends and convex-concave patterns,
• parameterization of patterns,
• shape definition language,
• clustering and linguistic interpretation of cluster shape patterns,
• analysis of temporal relations between shape patterns,
• analysis of shape patterns given in expert knowledge, summaries and forecasting texts.

A triangular-episodes representation language was formulated in [15] for the representation and extraction of temporal patterns. These episodes, defined by the signs of the first and second derivatives of a time-dependent function, can be linguistically described as A: Increasing and Concave; B: Decreasing and Concave; C: Decreasing and Convex; D: Increasing and Convex; E: Linearly Increasing; F: Linearly Decreasing; G: Constant. These episodes can be used for coding time series and time series patterns as a sequence of symbols like ABCDABCDAB. Such a coded representation of time series can be used for dimensionality reduction, indexing, clustering of time series, etc. Possible applications of such a representation are process monitoring, diagnosis and control.

A more extended dictionary of temporal patterns defined by the signs of the first and the second derivatives of the pattern is considered in [25]. This dictionary includes the perceptual patterns PassedOverMaximum, IncreasingConcavely, StartedToIncrease, ConvexMaximum, etc. In the considered approach the profile of a measured variable x_j(t) is transformed into qualitative form by approximating x_j(t) with a proper analytical function from which the signs of the derivatives are extracted. The paper [25] introduces a method for reasoning about the form of recent temporal profiles of process variables, which carry important information about the process state and underlying process phenomena. Below is an example of a shape-analyzing rule used by a decision-making system in the control of fermentation processes [25]:

IF (DuringTheLast1hr Dissolved Oxygen has been DecreasingConcavely-Convexly) THEN (Report: Foaming) and (Feed antifoam agent).

A scaling of perception-based patterns is used in many papers. This scaling can be applied to time series values, to slope values, to convex-concave shapes, etc. The list of primitives Upslope, Large-Pos, Large-Neg, Med-Pos, Med-Neg, Trailing Edge, Cup, Cap was used for the description of carotid waveforms [32]. The sequence of primitives can be used in syntactic pattern recognition of systolic and diastolic epochs. In [11], the possibility of introducing fuzziness into syntactic descriptions of digital signals is discussed. Fuzzy boundaries between the systolic and diastolic regimes and between primitives can be represented by transition membership functions.

Scaling of slope values of functional dependencies and time series is used in a system [9] that generates linguistic descriptions of time series in the form of rules

If T is Tk then Y is Ak,

where Tk are fuzzy intervals, like Between A and B, Small, Large, and Ak are linguistic descriptions of trends, like Very Quickly Increasing, Slowly Decreasing, etc.

An evolutionary procedure is used to find an optimal partition of the time domain into fuzzy intervals where time series values are approximated by linear functions. The paper discusses methods of retranslating the obtained piecewise linear approximation of a time series into linguistic form.

Similar linguistic descriptions of time series are reported in [13], where a scaling of trends is used in a system that detects and describes linguistically significant trends in time-series data, applying wavelets and scale space theory. In that work some experimental results of applying this system to the summarization of weather data are presented. Below is an example of a description generated automatically by the system, which used perception-based patterns like dropped, dropped sharply, etc.:

“The temperature dropped sharply between the 3rd and the 6th. Then it rosebetween the 6th and the 11th, dropped sharply between the 14th and 16th anddropped between the 23rd and 28th.”

Another approach to the description of time series by rules If T is Tk then Y is Ak of the type considered above was proposed in [7]. The method is based on the Moving Approximation (MAP) transform [8, 6], which replaces a time series by the sequence of slope values of the linear functions approximating time series values in a sliding window. Perceptions like "Quickly Increasing", "Very Slowly Decreasing", etc. can be easily defined on the set of slope values generated by the MAP [7]. MAP can be used for the analysis of TS, e.g. in economics and finance, where the analysis of trends and tendencies in TS values plays an important role. MAP gives a basis for the definition of trend association measures for the evaluation of associations between time series and time series patterns [6]. Unlike the known similarity measures used in TSDM, the new measures are invariant under linear transformations of TS. For this reason they can be used as regular measures of similarity between TS and TS patterns in most tasks of TSDM.
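The MAP transform itself is straightforward to sketch: fit a least-squares line in each sliding window and keep its slope. This plain-Python sketch of the idea in [7, 8] takes the window length as a parameter.

```python
def map_transform(y, window):
    """Moving Approximation (MAP) transform: the sequence of least-squares
    slopes of linear functions fitted to the series in a sliding window."""
    xs = list(range(window))
    x_mean = sum(xs) / window
    denom = sum((x - x_mean) ** 2 for x in xs)   # same for every window
    slopes = []
    for start in range(len(y) - window + 1):
        w = y[start:start + window]
        y_mean = sum(w) / window
        num = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, w))
        slopes.append(num / denom)
    return slopes
```

Note that a transform y → a·y + b simply scales every slope by a, so a measure such as the cosine between slope sequences is unchanged for a > 0, which hints at why slope-based trend association measures can be invariant under linear transformations of the series.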

Perception-based conditions can be used in the analysis of associations between elements of systems described by multivariate time series, e.g. a system of petroleum wells given by time series of oil and gas production, or a system of currencies given by time series of exchange rates. Generation of association rules in the form

IF PBC then A is highly associated with B,

where PBC is a perception-based condition like High level of oil production in winter months in well number 25, was considered in [5] for the analysis of associations between oil and gas production of wells in an oilfield. Such associations can be compared with spatial and other relations existing between system elements.

The paper [3] presents an approach to modeling time series datasets using linguistic shape descriptors represented by parametric functions. A linguistic term "rising" indicates that the series at this point is changing, i.e. y(k+1) > y(k). A more complex term such as "rising more steeply" indicates that the trend of the series is changing, i.e. y(k+2) − y(k+1) > y(k+1) − y(k). These terms are measures of the first and the second derivatives of the series, respectively. Parametric prototype trend shapes are given by the following functions:

f_fls(t) = 1 − √(1 − (1 − t)^α),   f_rms(t) = 1 − √(1 − t^α),   f_fms(t) = √(1 − t^α),

f_rls(t) = √(1 − (1 − t)^α),   f_cr(t) = 1 − 2^α |t − 0.5|^α,   f_tr(t) = 2^α |t − 0.5|^α,

representing the perception-based shape patterns falling less steeply, rising more steeply, falling more steeply, rising less steeply, crest and trough, respectively. Each part of a time series can have membership in several trend shapes. The trend concepts are used to build fuzzy rules of the form [3]:

If trend is F then next point is Y; If trend is F then next point is current point + dY,

where F is a trend fuzzy set, such as "rising more steeply", and Y, dY are fuzzy time series values. Prediction using these trend fuzzy sets is performed using the Fril evidential logic rule. The approach also uses fuzzy scaling of trends with the linguistic patterns falling fast, falling slowly, constant, rising slowly, and rising fast.
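To illustrate how a window of a series can receive a fuzzy membership in such a parametric trend prototype, the sketch below normalizes the window to the unit square and scores its deviation from a prototype curve. Both the prototype form and the membership formula are illustrative assumptions, not the exact definitions of [3].

```python
def rising_more_steeply(t, alpha=2.0):
    """An illustrative prototype trend shape on [0, 1]: flat start, steep finish."""
    return 1.0 - (1.0 - t ** alpha) ** 0.5

def shape_membership(window, prototype, alpha=2.0):
    """Fuzzy membership of a series window in a prototype shape: normalise the
    window to the unit square, then score 1 / (1 + mean squared deviation)."""
    lo, hi = min(window), max(window)
    span = (hi - lo) or 1.0
    norm = [(v - lo) / span for v in window]
    n = len(window)
    mse = sum((norm[k] - prototype(k / (n - 1), alpha)) ** 2
              for k in range(n)) / n
    return 1.0 / (1.0 + mse)
```

A window that traces the prototype exactly gets membership 1, and the same window can simultaneously hold smaller memberships in the other trend shapes, as the text describes.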

A Shape Definition Language (SDL) was developed in [1] for retrieving objects based on shapes contained in the histories associated with these objects. For example, in a stock database, the histories of the opening price, closing price, daily high, daily low, and trading volume may be associated with each stock. SDL allows a variety of queries about the shapes of histories. It performs "blurry" matching [1], where the user cares about the overall shape but not about specific details. SDL has an efficient implementation based on an index structure for speeding up the execution of SDL queries. The alphabet of SDL contains a set of scaled patterns like slightly increasing transition, highly increasing transition, slightly decreasing transition and highly decreasing transition, denoted as up, Up, down and Down respectively. This alphabet can be used for the definition of shapes as follows: (shape name(parameters) descriptor). For example, a "spike" can be defined as (shape spike() (concat Up up down Down)), where the list of parameters is empty and concat denotes concatenation. Complex shapes can be derived by recursive combination of elementary and previously defined shapes. The approach gives the possibility to retrieve combinations of several shapes in different histories by using the logical operators and and or.
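A toy version of SDL-style blurry matching can be sketched as follows: translate consecutive transitions into the up/Up/down/Down alphabet and search for a shape such as the spike. The `slight` threshold and the extra `flat` symbol are illustrative assumptions, not part of SDL as defined in [1].

```python
def to_alphabet(series, slight=1.0):
    """Translate consecutive transitions into an SDL-style alphabet:
    'up'/'down' for moves up to `slight`, 'Up'/'Down' for larger ones."""
    out = []
    for a, b in zip(series, series[1:]):
        d = b - a
        if d > 0:
            out.append('Up' if d > slight else 'up')
        elif d < 0:
            out.append('Down' if -d > slight else 'down')
        else:
            out.append('flat')
    return out

# (shape spike() (concat Up up down Down)) from the SDL example:
SPIKE = ['Up', 'up', 'down', 'Down']

def contains(symbols, shape):
    """Blurry matching: the overall shape occurs somewhere, details ignored."""
    return any(symbols[i:i + len(shape)] == shape
               for i in range(len(symbols) - len(shape) + 1))
```

The matching is "blurry" in the sense that any history whose transitions follow the Up-up-down-Down profile matches the spike, regardless of the exact prices involved.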

Clustering and linguistic interpretation of shape patterns is considered in [16]. This paper studies the problem of finding rules relating patterns in a time series to other patterns in that series, or patterns in one series to patterns in another series. The patterns are formed from data. The method first forms subsequences by sliding a window through the time series, and then clusters these subsequences using a suitable measure of time series similarity. Rule-finding methods are then used to obtain rules from the sequences. The simplest rules have the format A ⇒(T) B, i.e.

if A occurs, then B occurs within time T,

where A and B are identifiers of patterns (clusters of patterns) discovered in the time series. The approach was applied to several data sets. As an example, from the daily closing share prices of 10 database companies traded on the NASDAQ the

following significant rule was found: s18 ⇒(20) s4. The patterns s18 and s4 represent some clusters of patterns. An interpretation of the rule can have the form:

“a stock which follows the 2.5-week declining pattern of s18 (a sharper decrease and then a leveling out) will likely incur a short sharp fall within 4 weeks (20 days) before leveling out again (the shape of s4)”.

Temporal relations between shape patterns are considered in [19]. The approach to knowledge discovery from multivariate time series uses segmentation of the time series and transformation into sequences of state intervals (b_i, s_i, f_i), i = 1, ..., n. Here, s_i are time series states like increasing, decreasing, constant, highly increasing, convex, holding during the time periods [b_i, f_i). The temporal relationships between state intervals are described by the thirteen temporal relationships of Allen's interval logic [2]. Finally, rules with frequent temporal patterns in the premise and conclusion are derived. The method was applied to time series of air-pressure and wind strength/wind direction data [19]. The smoothed time series have been partitioned into segments with primitive patterns like very highly increasing, constant, decreasing. Below is an example of an association rule generated by the proposed approach:

convex air pressure, highly decreasing air pressure, decreasing air pressure → highly increasing wind strength,

where the patterns in the antecedent part of the rule are mutually related by Allen temporal relationships like overlaps, equals, meets. The meaningful rules obtained by the described technique can be used together with expert knowledge in the construction of an expert system [19].
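A few of Allen's interval relations between such state intervals (b_i, f_i) can be checked directly; this sketch covers only four of the thirteen relations and treats intervals as half-open [b, f).

```python
def allen(a, b):
    """Four of Allen's thirteen interval relations between state intervals
    (b1, f1) and (b2, f2); the remaining relations fall through to 'other'."""
    (b1, f1), (b2, f2) = a, b
    if (b1, f1) == (b2, f2):
        return 'equals'
    if f1 == b2:
        return 'meets'
    if b1 < b2 < f1 < f2:
        return 'overlaps'
    if f1 < b2:
        return 'before'
    return 'other'
```

For example, a "decreasing air pressure" interval ending exactly when a "highly increasing wind strength" interval begins stands in the meets relation.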

Several approaches use the analysis of time series shape patterns given in expert knowledge, summaries and forecasting texts for transforming them into a rule-based fuzzy expert system or for generating texts similar to human descriptions in the considered problem area. A rule-based fuzzy expert system WXSYS, realized in FuzzyClips, attempts to predict local weather based on conventional wisdom [29]. Below are examples of expert rules which are used in the system in a formalized form:

“Generally, if the barometer falls steadily and the wind comes from an easterlyquarter, expect foul weather”.

“If the current wind is blowing from S to SW and the current barometricpressure is rising from 30.00 or below, then the weather will be clearing withina few hours, then fair for several days”.

The system called Ana [26] uses a special grammar for the generation of summaries based on patterns retrieved from summaries generated by human experts. Ana generates descriptions of stock market behavior. Data from a Dow Jones stock quotes database serves as input to the system, and the opening paragraphs of a stock market summary are produced as output. The following text sample is one of the possible interpretations of data generated by the system:

“Wall Street’s securities markets rose steadily through most of the morning,before sliding downhill late in the day. The stock market posted a small lossyesterday, with the indexes finishing with mixed results in active trading. TheDow Jones average of 30 industrials surrendered a 16.28 gain at 4pm anddeclined slightly, to finish at 1083.61, off 0.18 points”.

A more extended system called StockReporter is discussed in [33]. This system is one of a few online text generation systems which generate textual descriptions of numeric data sets [36]. StockReporter produces reports that incorporate both text and graphics. It reports on the behavior of any one of 100 US stocks and on how that stock's behavior compares with the overall behavior of the Dow Jones Index or the NASDAQ. StockReporter can generate a text like the following:

“Microsoft avoided the downwards trend of the Dow Jones average today.Confined trading by all investors occurred today. After shooting to a high of$104.87, its highest price so far for the month of April, Microsoft stock easedto finish at an enormous $104.37”.

Finally, we cite some perception-based weather forecasts generated by weather.com:

“Scattered thunderstorms ending during the evening”, “Skies will becomepartly cloudy after midnight”, “Occasional showers possible”.

The considered examples of generated texts with perception-based terms support one of the main ideas of Computing with Words and Perceptions: that the inputs and/or outputs of real decision-making systems can contain perceptions described by words [38].

3 Perception Based Functions in Qualitative Forecasting

In this section we discuss methods of modeling qualitative expert forecasting by perception based functions (PBF) [4]. Qualitative forecasting methods use the opinions of experts to subjectively predict future events [12]. These methods are usually applied when historical data are either unavailable or scarce, for example when forecasting sales of a new product. In subjective curve fitting applied to predicting sales of a new product, the product life cycle is usually thought of as consisting of several stages: “growth”, “maturity” and “decline”. Each stage is represented by qualitative patterns of sales as follows [12]:

“Growth” stage: Start Slowly, then Increase Rapidly, and then Continue to Increase at a Slower Rate;

“Maturity” stage (sales of the product stabilize): Increasing Slowly, Reaching a Plateau, and then Decreasing Slowly;

“Decline” stage: Decline at an Increasing Rate.


Towards Perception Based Time Series Data Mining 225

Fig. 2 Product life cycle. Adapted from [12]

The “growth” stage is subjectively represented as an S-curve, which can then be used to forecast sales during this stage (see Fig. 2). To predict the time intervals for each step of the “growth” stage, the company uses expert knowledge and its experience with other products. Such subjective curve fitting is very difficult and requires a great deal of expertise and judgment [12].

The methods of reconstruction of perception based functions [4, 10] can support the process of qualitative forecasting. The qualitative patterns of sales may be represented by perception based patterns of trends, and each stage may be represented by a sequence of fuzzy time intervals with different trend patterns. For example, the S-curve modeling the “growth” stage can be represented by the following rules:

R1: If T is Start of Growth Stage then V is Slowly Increasing and Convex;

R2: If T is Middle of Growth Stage then V is Quickly Increasing;

R3: If T is End of Growth Stage then V is Slowly Increasing and Concave;

where T denotes time intervals and V denotes sale volumes. Perception based functions use scaling of trend patterns and fuzzy granulation of scale grades.

Below is an example of a linguistic scale of directions given by Increasing-Decreasing patterns: LD = <Extremely Quickly Decreasing, Quickly Decreasing, Decreasing, Slowly Decreasing, Constant, Slowly Increasing, Increasing, Quickly Increasing, Extremely Quickly Increasing>. In abbreviated form this scale will be written as follows: LD = <1:EQD, 2:QDE, 3:DEC, 4:SDE, 5:CON, 6:SIN, 7:INC, 8:QIN, 9:EQI>. The granulation of this scale may be done by a suitable crisp or fuzzy partition of the range of slope values of the MAP transform of the time series. Figure 3 depicts possible axes of fuzzy directions corresponding to the grades of this scale.
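As an illustration, such a granulation of the LD scale can be sketched by partitioning the angle of inclination of a slope into bins. This is a minimal sketch only: the crisp, equal-width angular bins and the name `direction_label` are assumptions introduced here, not the partition used in the paper.

```python
import math

# The nine grades of the LD scale in abbreviated form.
LD = ["EQD", "QDE", "DEC", "SDE", "CON", "SIN", "INC", "QIN", "EQI"]

def direction_label(slope):
    """Map a slope value to a grade of the LD scale via a crisp partition
    of the angle of inclination (nine equal bins from -90 to +90 degrees)."""
    angle = math.degrees(math.atan(slope))
    bin_width = 180.0 / len(LD)
    index = min(int((angle + 90.0) / bin_width), len(LD) - 1)
    return LD[index]

label = direction_label(1.0)   # a 45-degree slope falls in the "INC" bin
```

A fuzzy granulation would replace the crisp bins with overlapping membership functions over the same angular range.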

The scale of concave-convex patterns can have the following grades: LCC = <Strongly Concave, Concave, Slightly Concave, Linear, Slightly Convex, Convex, Strongly Convex>. The grades of these scales can be used to generate linguistic descriptions like “Quickly Increasing and Slightly Concave”, “Slowly Decreasing and Strongly Convex”, etc. Such perception based descriptions may be represented by suitable convex-concave modifications of the corresponding axes of directions and by further fuzzification of the obtained curves. Several methods of convex (CV) and concave (CC) modifications of directions were proposed in [10].

Fig. 3 Axes of directions

BZ-modifications are based on Zadeh’s operation of contrast intensification:

CC(y) = y2 − (y2 − y)² / (y2 − y1),    CV(y) = y1 + (y − y1)² / (y2 − y1),

where y is the modified function and y1, y2 are the minimal and maximal values of y(x) on the considered interval of the input variable x. The grades of the LCC scale are represented by the following patterns, respectively: PCC = <CC(CC(CC(y))), CC(CC(y)), CC(y), y, CV(y), CV(CV(y)), CV(CV(CV(y)))>. Figure 4a depicts these patterns applied to the directions 7:INC and 4:SDE. For example, the pattern Slowly Decreasing and Strongly Convex is represented by the lowest curve in Fig. 4a and is calculated by f = CV(CV(CV(y))), where y is the line corresponding to the direction 4:SDE.
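The BZ formulas are direct to compute; a minimal sketch follows. The function names and the sample value range 16 to 30 (matching the axes of Fig. 4) are our own choices.

```python
def bz_cc(y, y1, y2):
    """BZ concave modification: CC(y) = y2 - (y2 - y)^2 / (y2 - y1)."""
    return y2 - (y2 - y) ** 2 / (y2 - y1)

def bz_cv(y, y1, y2):
    """BZ convex modification: CV(y) = y1 + (y - y1)^2 / (y2 - y1)."""
    return y1 + (y - y1) ** 2 / (y2 - y1)

# A linear segment rising from 16 to 30, as in Fig. 4.
line = [16 + 14 * i / 10.0 for i in range(11)]
y1, y2 = min(line), max(line)

# Iterating CC three times gives the "Strongly Concave" pattern of PCC.
strongly_concave = line
for _ in range(3):
    strongly_concave = [bz_cc(y, y1, y2) for y in strongly_concave]

# Endpoints stay fixed; interior points are pulled upward (concave side).
assert all(c >= y for c, y in zip(strongly_concave, line))
```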

BY-modification uses the following CC-CV modifications of a linear function y:

CCt(y) = y2 − ((y2 − y1)^t − (y − y1)^t)^(1/t),

CVt(y) = y1 + ((y2 − y1)^t − (y2 − y)^t)^(1/t),

where t is a parameter, t ∈ (0, 1]. We have CC1 = CV1 = I, where I(y) = y. Figure 4b shows CC-CV patterns in the directions 7:INC and 4:SDE corresponding to all grades of the LCC scale, obtained by the BY-modifications <CC0.4, CC0.6, CC0.8, I, CV0.8, CV0.6, CV0.4>, respectively.
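A sketch of the BY forms, with the same assumed value range 16 to 30 as above; the identity at t = 1 follows directly from the formulas.

```python
def by_cc(y, y1, y2, t):
    """BY concave modification: CC_t(y) = y2 - ((y2-y1)^t - (y-y1)^t)^(1/t)."""
    return y2 - ((y2 - y1) ** t - (y - y1) ** t) ** (1.0 / t)

def by_cv(y, y1, y2, t):
    """BY convex modification: CV_t(y) = y1 + ((y2-y1)^t - (y2-y)^t)^(1/t)."""
    return y1 + ((y2 - y1) ** t - (y2 - y) ** t) ** (1.0 / t)

# t = 1 gives the identity; smaller t bends the curve more strongly,
# which is why <CC0.4, CC0.6, CC0.8, I, CV0.8, CV0.6, CV0.4> orders the
# LCC grades from Strongly Concave to Strongly Convex.
assert abs(by_cc(20.0, 16.0, 30.0, 1.0) - 20.0) < 1e-9
assert by_cc(20.0, 16.0, 30.0, 0.4) > by_cc(20.0, 16.0, 30.0, 0.8) > 20.0
```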


Fig. 4 Granulation of convex-concave patterns in the directions 7:INC (dashed line) and 4:SDE (dotted line) obtained by a) BZ-modification; b) BY-modification; and c) BS-modification

BS-modification applies the following CC-CV modifications of a linear function y:

CCs(y) = y1 + (y2 − y1)(y − y1) / ((y2 − y1) + s(y2 − y)),

CVs(y) = y2 − (y2 − y1)(y2 − y) / ((y2 − y1) + s(y − y1)),

where s ∈ (−1, 0]. We have CC0 = CV0 = I. Figure 4c shows CC-CV patterns obtained by BS-modification of the directions 7:INC and 4:SDE corresponding to all grades of the LCC scale, as follows: <CC−0.95, CC−0.8, CC−0.5, I, CV−0.5, CV−0.8, CV−0.95>, respectively.
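A sketch of the BS forms, again over an assumed value range of 16 to 30; s = 0 recovers the identity, and s closer to −1 bends more strongly.

```python
def bs_cc(y, y1, y2, s):
    """BS concave modification: CC_s(y) = y1 + (y2-y1)(y-y1) / ((y2-y1) + s(y2-y))."""
    return y1 + (y2 - y1) * (y - y1) / ((y2 - y1) + s * (y2 - y))

def bs_cv(y, y1, y2, s):
    """BS convex modification: CV_s(y) = y2 - (y2-y1)(y2-y) / ((y2-y1) + s(y-y1))."""
    return y2 - (y2 - y1) * (y2 - y) / ((y2 - y1) + s * (y - y1))

# s = 0 is the identity; the sequence <CC-0.95, CC-0.8, CC-0.5, I,
# CV-0.5, CV-0.8, CV-0.95> runs from Strongly Concave to Strongly Convex.
assert abs(bs_cc(20.0, 16.0, 30.0, 0.0) - 20.0) < 1e-9
assert bs_cc(20.0, 16.0, 30.0, -0.8) > bs_cc(20.0, 16.0, 30.0, -0.5) > 20.0
```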

The linear and convex-concave patterns may be used for crisp and fuzzy modeling of perception based time series patterns. Fuzzy modeling uses fuzzification of trend patterns [4]. This fuzzification may be defined parametrically, depending on the type and parameters of the fuzzy set used for fuzzification. A possible reconstruction of the rules R1-R3 considered above is shown in Fig. 5. The resulting fuzzy function is obtained as a concatenation of fuzzy perception based patterns defined on the sequence of fuzzy intervals given in the antecedents of the rules R1-R3. The width of this fuzzy function corresponds to the uncertainty of the forecast. This PBF can be used for forecasting sales of a new product. Perception based functions corresponding to the “maturity” and “decline” stages can be reconstructed similarly.

Fig. 5 Fuzzy perception based S-curve modeling of the “growth” stage
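The concatenation of patterns from rules R1-R3 can be sketched in crisp form (fuzzification of intervals and patterns omitted). All interval boundaries, volumes and the BZ-style segment modifications below are illustrative assumptions, not values from the paper.

```python
def cv(y, lo, hi):
    """Convex modification of a value in [lo, hi] (BZ form)."""
    return lo + (y - lo) ** 2 / (hi - lo)

def cc(y, lo, hi):
    """Concave modification of a value in [lo, hi] (BZ form)."""
    return hi - (hi - y) ** 2 / (hi - lo)

def s_curve(t, t1=3.0, t2=7.0, t_end=10.0, v1=2.0, v2=8.0, v_max=10.0):
    """Crisp S-curve assembled from the three rule antecedent intervals."""
    if t <= t1:                      # R1: slowly increasing and convex
        return cv(v1 * t / t1, 0.0, v1)
    if t <= t2:                      # R2: quickly increasing (linear)
        return v1 + (v2 - v1) * (t - t1) / (t2 - t1)
    # R3: slowly increasing and concave
    return cc(v2 + (v_max - v2) * (t - t2) / (t_end - t2), v2, v_max)

curve = [s_curve(t / 2.0) for t in range(21)]
assert curve == sorted(curve)        # segments join continuously and monotonically
```

A fuzzy version would attach a parametric fuzzy band around each segment, whose width encodes the forecasting uncertainty.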

Perception based functions provide natural and flexible tools for modeling human expert knowledge about function shapes. They have the following advantages over the logarithmic, exponential and other mathematical functions usually used in such modeling: a PBF can easily be composed from different segments; each segment has a natural linguistic description corresponding to human perceptions; each perception based pattern can easily be tuned, due to its parametric definition, to better match both expert opinion and training data when the latter are available; and the fuzzy function obtained as a result of the reconstruction of a PBF makes it possible to model different types of uncertainty present in forecasting.

4 Conclusions

Human decision making in economics, finance, industry, natural systems, etc. is often based on the analysis of time series data bases. Such decision making procedures use expert knowledge, usually formulated linguistically, and perceptions obtained as a result of the analysis of TSDB. CWP and PTSDM can serve as the main constituents of intelligent decision making systems which use expert knowledge in decision making problems related to TSDB. The goal of perception based time series data mining is to support procedures of computing with words and perceptions in TSDB domains. The methods of representation and manipulation of perceptions and the methods of extraction of perceptions from TSDB relevant to decision making problems should adopt methods developed in fuzzy set theory, computational linguistics, syntactic pattern recognition, time series analysis and time series data mining. Some approaches to modeling perception based patterns in time series data base domains have been discussed in this paper. Further directions of research will include the adaptation of models and effective algorithms developed in TSDM to the representation and processing of fuzzy perceptions, the development of methods of computing with words and perceptions for modeling decision-making procedures in TSDB domains, and the development of decision making systems based on CWP and PTSDM in specific problem areas. Some applications of soft computing methods in data mining and time series analysis can be found in [3, 11, 14, 17, 19, 21, 22, 27, 29, 30, 31, 34, 35].

Acknowledgments The research work was supported by projects D.00006 and D.00322 of IMP.

References

1. Agrawal, R., Psaila, G., Wimmers, E.L., Zait, M.: Querying shapes of histories. Proc. 21st Intern. Conf. Very Large Databases, VLDB ’95, Zurich, Switzerland (1995) 502–514

2. Allen, J.F.: Maintaining knowledge about temporal intervals. Comm. ACM 26 (11) (1983) 832–843

3. Baldwin, J.F., Martin, T.P., Rossiter, J.M.: Time series modeling and prediction using fuzzy trend information. Proc. Int. Conf. SC Information/Intelligent Syst. (1998) 499–502

4. Batyrshin, I.: Construction of granular derivatives and solution of granular initial value problem. In: Nikravesh, M., Zadeh, L.A., Korotkikh, V. (eds.): Fuzzy Partial Differential Equations and Relational Equations. Reservoir Characterization and Modeling. Studies in Fuzziness and Soft Computing, Vol. 142. Springer-Verlag, Berlin Heidelberg New York (2004) 285–307

5. Batyrshin, I., Cosultchi, A., Herrera-Avelar, R.: Association network of petroleum wells based on partial time series analysis. Proc. Workshop Intel. Comput., Mexico (2004) 17–26

6. Batyrshin, I., Herrera-Avelar, R., Sheremetov, L., Panova, A.: Association networks in time series data mining. NAFIPS 2005, USA (2005) 754–759

7. Batyrshin, I., Herrera-Avelar, R., Sheremetov, L., Suarez, R.: On qualitative description of time series based on moving approximations. Proc. Intern. Conf. Fuzzy Sets and Soft Computing in Economics and Finance, FSSCEF 2004, Russia, Vol. I (2004) 73–80

8. Batyrshin, I., Herrera-Avelar, R., Sheremetov, L., Suarez, R.: Moving approximations in time series data mining. Proc. Int. Conf. Fuzzy Sets and Soft Computing in Economics and Finance, FSSCEF 2004, St. Petersburg, Russia, Vol. I (2004) 62–72

9. Batyrshin, I., Wagenknecht, M.: Towards a linguistic description of dependencies in data. Int. J. Applied Math. and Computer Science, Vol. 12 (2002) 391–401

10. Batyrshin, I.Z.: On reconstruction of perception based functions with convex-concave patterns. Proc. Int. Conf. Computational Intelligence ICCI 2004, Nicosia, North Cyprus, Near East University Press (2004) 30–34

11. Bezdek, J.C.: Fuzzy models and digital signal processing (for pattern recognition): Is this a good marriage? Digital Signal Processing, 3 (1993) 253–270

12. Bowerman, B.L., O’Connell, R.T.: Time series and forecasting: an applied approach. Duxbury Press, Mass. (1979)

13. Boyd, S.: TREND: A system for generating intelligent descriptions of time-series data. Proc. IEEE Intern. Conf. Intelligent Processing Systems (ICIPS 1998) (1998)

14. Chen, G.Q., Wei, Q., Kerre, E.E.: Fuzzy Data Mining: Discovery of Fuzzy General. Assoc. Rules. In: Recent Research Issues Manag. Fuzz. in DB, Physica-Verlag (2000)


15. Cheung, J.T., Stephanopoulos, G.: Representation of process trends. Part I. A formal representation framework. Computers and Chemical Engineering, 14 (1990) 495–510

16. Das, G., Lin, K.I., Mannila, H., Renganathan, G., Smyth, P.: Rule discovery from time series. Proc. KDD98 (1998) 16–22

17. Dubois, D., Hullermeier, E., Prade, H.: A note on quality measures for fuzzy association rules. IFSA 2003, LNAI 2715, Springer-Verlag (2003) 677–648

18. Hand, D., Manilla, H., Smyth, P.: Principles of Data Mining. MIT Press, Cambridge (2001)

19. Höppner, F.: Learning Temporal Rules from State Sequences. IJCAI Workshop on Learning from Temporal and Spatial Data, Seattle, USA (2001) 25–31

20. Jang, J.-S.R., Sun, C.T., Mizutani, E.: Neuro-Fuzzy and Soft Computing. A Computational Approach to Learning and Machine Intelligence. Prentice-Hall International (1997)

21. Kacprzyk, J., Zadrozny, S.: Data Mining via Linguistic Summaries of Data: An Interactive Approach. IIZUKA’98, Iizuka, Japan (1998) 668–671

22. Kandel, A., Last, M., Bunke, H. (eds.): Data Mining and Computational Intelligence. Physica-Verlag, Studies in Fuzziness and Soft Computing, Vol. 68 (2001)

23. KDnuggets: Polls: Time-Series Data Mining (Nov 2004). What Types of TSDM You’ve Done? http://www.kdnuggets.com/polls/2004/time_series_data_mining.htm

24. Klir, G.J., Clair, U.S., Yuan, B.: Fuzzy Set Theory: Foundations and Applications. Prentice Hall Inc. (1997)

25. Konstantinov, K.B., Yoshida, T.: Real-time qualitative analysis of the temporal shapes of (bio)process variables. American Inst. Chem. Eng. Journal 38 (1992) 1703–1715

26. Kukich, K.: Design of a knowledge-based report generator. Proc. 21st Annual Meeting of the Association for Computational Linguistics (ACL-1983) (1983) 145–150

27. Last, M., Klein, Y., Kandel, A.: Knowledge discovery in time series databases. IEEE Trans. SMC, Part B, 31 (2001) 160–169

28. Lin, J., Keogh, E., Lonardi, S., Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. Proc. 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, San Diego, CA (2003)

29. Maner, W., Joyce, S.: WXSYS: Weather Lore + Fuzzy Logic = Weather Forecasts. 1997 CLIPS Virtual Conf. http://web.cs.bgsu.edu/maner/wxsys/wxsys.htm (1997)

30. Mitra, S., Pal, S.K., Mitra, P.: Data Mining in Soft Computing Framework: A Survey. IEEE Transactions on Neural Networks, 13 (2002) 3–14

31. Pons, O., Vila, M.A., Kacprzyk, J. (eds.): Knowledge Management in Fuzzy Databases. Physica-Verlag (2000)

32. Stockman, G., Kanal, L., Kyle, M.C.: Structural pattern recognition of carotid pulse waves using a general waveform parsing system. CACM 19 (1976) 688–695

33. StockReporter. http://www.ics.mq.edu.au/~ltgdemo/StockReporter/about.html

34. Sudkamp, T.: Examples, counterexamples, and measuring fuzzy associations. Fuzzy Sets and Systems, 149 (2005) 57–71

35. Yager, R.R.: On linguistic summaries of data. In: Piatetsky-Shapiro, G., Frawley, B. (eds.): Knowledge Discovery in Databases. MIT Press, Cambridge, MA (1991) 347–363

36. Yu, J., Reiter, E., Hunter, J., Mellish, C.: Choosing the content of textual summaries of large time-series data sets. Natural Language Engineering (2005) (To appear)

37. Zadeh, L.A.: Outline of a new approach to the analysis of complex systems and decision processes. IEEE Trans. Systems, Man and Cybernetics SMC-3 (1973) 28–44

38. Zadeh, L.A.: From computing with numbers to computing with words - from manipulation of measurements to manipulation of perceptions. IEEE Trans. on Circuits and Systems - I: Fundamental Theory and Applications, Vol. 45 (1999) 105–119


Veristic Variables and Approximate Reasoning for Intelligent Semantic Web Systems

Ronald R. Yager

Abstract Our concern is with the development of tools useful for the construction of semantically intelligent web based systems. We indicate that fuzzy set based reasoning systems such as approximate reasoning provide a fertile framework for the construction of these types of tools. Central to this framework is the representation of knowledge as the association of a constraint with a variable. Here we study one important type of variable, the veristic variable. These are variables that can assume multiple values. Different types of statements providing information about veristic variables are described. A methodology is presented for representing and manipulating veristic information. We consider the role of these veristic variables in databases and describe methods for representing and evaluating queries involving veristic variables.

1 Introduction

An important aspect of the future of man-machine interaction will involve search through a body of text. Particular examples of this are question-answering systems and agent-mediated searchers. The future development of the semantic web will greatly help in this task (Berners-Lee, Hendler and Lassila 2001; Fensel, Hendler, Lieberman and Wahlster 2003; Antoniou and van Harmelen 2004; Passin 2004). A fundamental requirement here is a semantically rich formal mechanism for reasoning with the kinds of information expressed in terms of a natural language. Starting with Zadeh’s 1965 paradigm changing paper, more and more sophisticated fuzzy set based semantic technologies have been developed with these required capabilities. We particularly note the theory of approximate reasoning (Zadeh 1979a), the paradigm of computing with words (Zadeh 1996) and the generalized theory of uncertainty (Zadeh 2005). While these were developed by Zadeh, other researchers too numerous to mention have made important contributions to the development of a fuzzy set based semantics (Dubois and Prade 1991; Novak 1992). A recent book edited by Sanchez (2006) focuses on the interaction between fuzzy sets and the semantic web.

Ronald R. Yager, Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007

2 Approximate Reasoning

The fuzzy set based theory of approximate reasoning (AR) can provide the formal machinery needed to help in the construction of intelligent semantic web systems. A central component of AR is the representation of knowledge by the association of a constraint with a variable. Zadeh [2005] discussed the idea of a generalized constraint and suggested some of the types of variables needed to represent various kinds of human knowledge; among these were veristic and possibilistic variables.

While our focus here is on veristic variables (Yager 2000), a basic understanding of possibilistic variables is needed because of their central role in the theory of approximate reasoning. Possibilistic variables (Zadeh 1978; Dubois and Prade 1988b) have been widely studied in the approximate reasoning literature and have often been used to represent linguistic information. An important characteristic of possibilistic variables is the fact that they are restricted to have one and only one value. Variables such as Bernadette’s age, the current temperature or your father’s name are examples of possibilistic variables; they can only assume one value. While the actual value of a possibilistic variable is unique, our knowledge about this value can be uncertain. This is especially true when our knowledge comes from information described in a natural language. In order to represent this uncertainty, sets, and particularly fuzzy sets, play an important role in expressing our knowledge of the value of a possibilistic variable. For example, if U is the variable Bernadette’s age, then the knowledge “Bernadette is young” can be expressed in the language of AR as U is A, where A is a fuzzy subset of the domain of ages; A(x) indicates the possibility that Bernadette is x years old. We note that the special case where A is a crisp set with one element corresponds to the situation in which the value of U is exactly known. Thus the knowledge “Bernadette is 21” is represented as U is A where A = {21}. Of particular significance to us is that a well established machinery has been developed, within the theory of AR, for the manipulation of possibilistic information, which we shall feel free to draw upon (Zadeh 1979a; Yager and Filev 1994).

In addition to playing a fundamental role in AR, possibilistic variables have also been directly used in databases (Zemankova and Kandel 1984; Petry 1996). Their use in databases has been primarily in two ways. One use has been to represent linguistic quantities appearing as attribute values in a database. In this usage we would represent the age of a person who is known to be young by a possibility distribution obtained from the fuzzy subset young. The second, and currently more prevalent, use in database applications is to use possibility distributions to represent flexible queries (Andreasen, Christianisen and Larsen 1997; Bosc and Prade 1997).

A second class of variables available in approximate reasoning are veristic variables (Yager 1982; 1984; 1987a; 1987b; 1988; 2000; 2002). These types of variables, as opposed to possibilistic ones, are allowed to have any number of solutions. One example of a veristic variable is the variable children of Masoud. The friends of Didier is another example of a veristic variable. The temperatures I like, as well as the days I worked this week, are also examples of veristic variables. We note that each of these variables can have multiple solutions or none. As we shall see, as in the case of possibilistic variables, sets will play a central role in expressing information about the variable. However, here the sets will be used to represent the multiplicity of solutions rather than uncertainty. Fuzzy sets will also be used to allow for degrees of solution.

3 Expressing Veristic Type Knowledge

Yager [2000] provided an extension of Zadeh’s (1997; 2005) copula notation for the representation of statements involving veristic variables within the language of approximate reasoning. Let V be a veristic variable taking its values in the set X; that is, V can attain any number of values in X. Zadeh [1997] suggested the symbolic representation of a canonical statement involving veristic information, based on the association of a fuzzy subset A with a veristic variable, as V is ν A. Here the copula is ν is used to indicate that the variable is veristic. We note that for possibilistic variables we use the copula is. In this framework the membership grade of x in A, A(x), indicates the truth of the statement that x is a value of V given the statement V is ν A. If V is the variable languages spoken by Janusz, then the statement V is ν {French, Spanish, English, Polish, Russian, Italian, German} indicates that Janusz speaks the seven languages in the set. Fuzzy sets enter the picture when there exists some gradation as to the degree to which an element belongs to the set. Let V be the variable women Lotfi likes, which is a veristic variable. Let the fuzzy subset blond women be defined as

Blond = {1/Alice, 0.7/Barbara, 0.5/Carol, 0.3/Debbie, 0/Esther}.

The statement Lotfi likes blonds is expressible as V is ν Blond. In this case Blond(Barbara) = 0.7 indicates the veracity of the statement that Lotfi likes Barbara. The statement Henrik likes warm weather can be seen to induce a veristic type statement V is ν Warm, where V is the veristic variable temperatures Henrik likes and Warm is a fuzzy subset over the space of temperatures.

In order to clearly understand the distinction between possibilistic variables and veristic variables, consider the following. Let U and V be possibilistic and veristic variables respectively, and let A = {1, 2, 3}. Consider the two AR statements U is A and V is ν A. The first statement conveys the knowledge

U = 1 or U = 2 or U = 3

while the second conveys the knowledge

V = 1 and V = 2 and V = 3
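This disjunction-versus-conjunction reading can be made concrete with a small sketch; the set-based encoding and function names below are our own, not the paper's.

```python
A = {1, 2, 3}

def satisfies_possibilistic(u, A):
    """U is A: the single value u need only be one of the elements (disjunction)."""
    return u in A

def satisfies_veristic(solution_set, A):
    """V is-v A (open form): every element of A must be a solution (conjunction)."""
    return A <= solution_set

assert satisfies_possibilistic(2, A)          # U = 2 alone satisfies "U is A"
assert not satisfies_veristic({2}, A)         # V = 2 alone does not satisfy "V is-v A"
assert satisfies_veristic({1, 2, 3, 7}, A)    # extra solutions are allowed (open statement)
```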


Yager [2000] identified four different types of canonical statements involving veristic variables and used these to motivate a refinement of the use of veristic statements within the language of AR. Consider the statements

S-1 John likes blonds
S-2 John likes only blonds
S-3 John doesn’t like blonds
S-4 John only doesn’t like blonds.

We see that statements S-1 and S-2 are positive statements in that they indicate objects that are solutions to our variable, women John likes. Statements S-3 and S-4 are negative statements; they indicate objects that are not solutions to our variable. Along a different dimension, statements S-1 and S-3 are of the same type, while statements S-2 and S-4 are of the same type. We see that S-1 and S-3 are “open” statements in the sense that while S-1 indicates that John likes blonds, it doesn’t preclude him from liking other women. Thus one would not necessarily find contradictory the two statements

John likes blonds
John likes redheads

Thus these types of statements are open to additions. The same is true of S-3: while it states that John doesn’t like blonds, additional statements indicating other people John doesn’t like would not be contradictory. On the other hand, the statements S-2 and S-4 are “closed” statements in that they preclude any other elements from joining.

Based upon these observations, Yager [2000] suggested that in expressing veristic information in the language of the theory of approximate reasoning we must, in addition to denoting that a variable is veristic, include information about the manner of association of the variable with its value. In the preceding we essentially indicated that there are two dimensions, one along a scale of positive/negative and the other along a scale of open/closed. In order to include this information with the AR expression, it was suggested to use two parameters with the veristic copula is ν. These parameters are c for closed and n for negative; the lack of appearance of a parameter means its opposite holds. Thus

is ν means open and positive
is ν(n) means open and negative
is ν(c) means closed and positive
is ν(c,n) means closed and negative

Using these copulas, and letting V indicate the variable women John likes and B indicate the subset blond, we can express the statements S-1 to S-4 as

S-1 V is ν B
S-2 V is ν(c) B
S-3 V is ν(n) B
S-4 V is ν(n, c) B.

Now that we have a means for expressing statements involving veristic variables, we turn to the issue of representing the knowledge contained in these statements in a manner we can manipulate within the AR system.

4 Representing and Manipulating Veristic Information

Since approximate reasoning has a powerful machinery for manipulating statements involving possibilistic variables, Yager [2000] suggested a methodology for converting statements involving veristic variables into propositions involving possibilistic variables. The possibilistic type information can then be manipulated using AR, and the results retranslated into statements involving veristic variables. Figure 1 shows the basic idea. Ideally, as we get more experience with this process we shall learn how to directly manipulate veristic information and circumvent the translation steps.

Fig. 1 Translation process: veristic statement → translation into possibilistic statements → knowledge manipulation → retranslation → veristic information

We now describe Yager’s method for expressing veristic information in terms of possibilities. Let V be a veristic variable taking its values in the space X. We associate with V a possibilistic variable V∗ taking its value in the space IX, the set of all fuzzy subsets of X. Thus V∗ has one and only one value, a fuzzy subset of X. The relationship between V and V∗ is that V∗ is the set of all solutions to V. Let V be the veristic variable week-days. We can have veristic statements such as

Tuesday and Thursday are week-days (V is ν {Tuesday, Thursday})

as well as

Saturday is not a week-day (V is ν(n) {Saturday}).

On the other hand, the set of all solutions to V is {Monday, Tuesday, Wednesday, Thursday, Friday}, i.e.

V∗ is {Monday, Tuesday, Wednesday, Thursday, Friday}

Yager [2000] noted that a statement involving a veristic variable V tells us something about the value of the associated possibilistic variable V∗. For example, with V the variable corresponding to people John likes, the veristic statement John likes Mary and Susan, V is ν {Mary, Susan}, indicates that the set that is the solution to V∗, the set of all the people John likes, must contain the elements Mary and Susan. Essentially this statement says that all sets containing Mary and Susan are possible solutions of V∗.

Motivated by this, Yager [2000] suggested a means for converting statements involving veristic variables into those involving the associated possibilistic variable. As seen from the above, the concept of inclusion (containment) plays a fundamental role in translating veristic statements into possibilistic ones. Specifically, the statement V is ν A means that any fuzzy subset containing A is a possible solution for V∗. More formally, as a result of this statement, if F is a fuzzy subset of X then the possibility that F is the solution of V∗ is equal to the degree to which A is contained in F. This situation requires us to introduce a definition for containment. As discussed by Bandler and Kohout [1980], there are many possible definitions of containment in the fuzzy framework. We have chosen to use Zadeh’s original definition of containment: if G and H are two fuzzy subsets on the domain X, then G is said to be contained in H if G(x) ≤ H(x) for all x in X. We emphasize here that this is not the only possible definition of containment that can be used. However, this choice does have the very special property that it is a binary definition: Deg(G ⊆ H) is either one or zero. This leads to a situation in which the possibility of a fuzzy subset of X being the solution to V∗ is either one or zero. This has the benefit of allowing us to more clearly see the processes involved in manipulating veristic information, which in turn will help us develop our intuition at this early stage of our work with veristic information.

In the following we list the translation rules, under the assumption of the Zadeh definition of containment, for the four canonical types of veristic statements.

I. V is ν A ⇒ V∗ is A∗1

where A∗1 is a subset of IX such that for any fuzzy subset F ∈ IX its membership grade is defined by

A∗1(F) = 1 if A(x) ≤ F(x) for all x in X (A ⊆ F)
A∗1(F) = 0 if A ⊄ F

Thus any F for which A∗1(F) = 1 is a possible solution for V∗.

II. V is ν(c) A ⇒ V∗ is A∗2

where A∗2 is a subset of IX such that

A∗2(A) = 1
A∗2(F) = 0 for F ≠ A

We see here that the statement V is ν(c) A induces an exact solution for V∗, namely A.


III. V is ν(n) A ⇒ V∗ is A∗3

where A∗3 is a subset of IX such that for any F ∈ IX

A∗3(F) = 1 if F(x) ≤ 1 − A(x) for all x in X (F ⊆ Ā)
A∗3(F) = 0 if F ⊄ Ā

Here again we see that we get a set of possible solutions, essentially those subsets containing no elements of A.

IV. V is ν(n, c) A ⇒ V∗ is A∗4

where A∗4 is a subset of IX such that

A∗4(Ā) = 1
A∗4(F) = 0 for all F ≠ Ā

Here we see that we again get an exact solution for V∗, namely Ā, the complement of A. Once having converted our information involving veristic variables into statements involving the associated possibilistic variable, we can use the machinery of approximate reasoning to manipulate, fuse, and reason with the information. In making this conversion we pay a price: the domain of V is X, while the domain of V∗, IX, is a much more complex and bigger space.
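Translation rules I and III under Zadeh containment can be sketched directly, encoding fuzzy subsets as dictionaries of membership grades. The dictionary encoding and function names are our own; the Blond example values come from Sect. 3.

```python
def contains(A, F):
    """Zadeh containment: A is contained in F iff A(x) <= F(x) for all x (binary degree)."""
    return all(A.get(x, 0.0) <= F.get(x, 0.0) for x in set(A) | set(F))

def a_star_1(A, F):
    """Rule I (V is-v A): possibility that F solves V* is 1 iff A is contained in F."""
    return 1 if contains(A, F) else 0

def a_star_3(A, F, domain):
    """Rule III (V is-v(n) A): 1 iff F(x) <= 1 - A(x) for all x, i.e. F contained
    in the complement of A."""
    return 1 if all(F.get(x, 0.0) <= 1.0 - A.get(x, 0.0) for x in domain) else 0

blond = {"Alice": 1.0, "Barbara": 0.7, "Carol": 0.5, "Debbie": 0.3}
F = {"Alice": 1.0, "Barbara": 0.8, "Carol": 0.5, "Debbie": 0.3, "Esther": 0.9}
domain = ["Alice", "Barbara", "Carol", "Debbie", "Esther"]

assert a_star_1(blond, F) == 1               # F contains blond, so F is a possible V*
assert a_star_1(blond, {"Alice": 1.0}) == 0  # misses Barbara, Carol, Debbie
```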

While in general the manipulation of pieces of information involving veristic variables requires their conversion into possibilistic statements, some of these operations can be performed directly with the veristic statements. Among the most useful are the following, which involve combinations of like statements:

1. V is ν A1 and V is ν A2 and . . . and V is ν An ≡ V is ν(∪j Aj)

The “anding” of open affirmative veristic statements results in the union of the associated sets.

John likes {Mary, Bill, Ann} and John likes {Tom, Bill} ⇒ John likes {Mary, Bill, Ann, Tom}

2. V is ν A1 or V is ν A2 or . . . or V is ν An ≡ V is ν(∩j Aj)

The “oring” of open affirmative veristic statements results in the intersection of the associated sets.

John likes {Mary, Bill, Ann} or John likes {Tom, Bill} ⇒ John likes {Bill}

3. V is ν(n) A1 and V is ν(n) A2 and . . . and V is ν(n) An ≡ V is ν(n)(∪j Aj)

4. V is ν(n) A1 or V is ν(n) A2 or . . . or V is ν(n) An ≡ V is ν(n)(∩j Aj)

R. R. Yager

We emphasize the unexpected association of union with the “and” and of intersection with “or” in the above combination rules.
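The combination rules for like statements can be illustrated with crisp sets, using the John-likes example above (plain Python sets stand in for the fuzzy subsets Aj):

```python
# Sketch of the combination rules for open affirmative veristic statements,
# using crisp sets of names from the text's example.
likes_1 = {"Mary", "Bill", "Ann"}
likes_2 = {"Tom", "Bill"}

# "anding" of open affirmative statements -> union of the associated sets
and_result = likes_1 | likes_2      # {"Mary", "Bill", "Ann", "Tom"}

# "oring" of open affirmative statements -> intersection of the sets
or_result = likes_1 & likes_2       # {"Bill"}

print(sorted(and_result))
print(sorted(or_result))
```

Note how the set operation runs opposite to the connective, exactly the inversion the paragraph above emphasizes.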

Often we are interested in having information about a particular element in X, such as whether x1 is a solution of V; this kind of information is not readily apparent from statements involving V∗. In order to more readily supply this kind of information about elements in X based upon statements involving V∗, we introduce two distributions on X induced by a statement involving V∗; these are called the verity and possibility distributions.

Let V be a variable taking its values in the space X. From the statement V∗ is W, where W is a fuzzy subset of IX, we can obtain a verity distribution on X, Ver: X → I, where I = [0, 1], such that

Ver(x) = MinF∈IX [F(x) ∨ (1 − W(F))]

In the special case when W is a crisp subset of IX then this reduces to

Ver(x) = MinF∈W [F(x)]

Here we see that Ver(x) is essentially the smallest membership grade of x in any subset of IX that can be the solution set to V∗. Ver(x) can be viewed as a lower bound or minimal support for the truth of the statement “x is a solution of V.” The measure of verity is closely related to the measure of certainty (necessity) used with possibilistic variables, although a little less complex. The additional complexity with possibilistic variables is a result of the fact that we have one and only one solution for a possibilistic variable; this requires that in determining the certainty of x we must take into account information about the possibility of other solutions. In the case of veristic variables, because any number of solutions is allowed, the determination of whether x is a solution to V is based solely on information about x and is independent of the situation with respect to any other element in the space X.
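The verity distribution can be checked by brute force on a toy example. The two-element domain, the coarse grade grid, and the particular A below are illustrative assumptions of ours:

```python
from itertools import product

# Sketch: verity distribution Ver(x) = min over F of [F(x) ∨ (1 - W(F))],
# brute-forced over a coarse discretization of IX.
X_SIZE = 2
GRID = [0.0, 0.5, 1.0]
SUBSETS = list(product(GRID, repeat=X_SIZE))

def ver(w, i):
    """Verity of element i given 'V* is W' (w maps each F to its grade)."""
    return min(max(f[i], 1.0 - w[f]) for f in SUBSETS)

# W induced by 'V is ν A' with A = (1.0, 0.5): W(F) = 1 iff A ⊆ F.
A = (1.0, 0.5)
W = {f: (1.0 if all(a <= x for a, x in zip(A, f)) else 0.0) for f in SUBSETS}

# The text's closed form V is ν A  =>  Ver(x) = A(x):
print(ver(W, 0), ver(W, 1))   # 1.0 0.5
```

The brute-force values agree with the closed form Ver(x) = A(x) derived below for open affirmative statements.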

We can show that the following are the verity distributions associated with the four canonical veristic statements:

V is ν A ⇒ Ver(x) = A(x)

V is ν(n) A ⇒ Ver(x) = 0

V is ν(c) A ⇒ Ver(x) = A(x)

V is ν(c, n) A ⇒ Ver(x) = 1 − A(x)

It should be emphasized that the statement V is ν(n) A has Ver(x) = 0 for all A; thus it never provides any support for the statement that x is a solution of V, whether x is contained in A or not.

We also note that since an “anding” of a collection of statements of the form V is ν Aj results in V is ν(∪j Aj), the verity distribution associated with this “anding” is

Ver(x) = Maxj[Aj(x)]


On the other hand, since an “oring” of statements of the form V is ν Aj results in V is ν(∩j Aj), the verity distribution associated with this “oring” is

Ver(x) = Minj[Aj(x)]

In the case of negative statements of the form V is ν(n) Aj, both the “anding” and “oring” of these kinds of statements result in Ver(x) = 0.

The second distribution on X generated from V∗ is W is called the possibility distribution; it is denoted Poss: X → I and is defined as

Poss(x) = MaxF∈IX[F(x) ∧W(F)].

In the special case when W is crisp this reduces to

Poss(x) = MaxF∈W[F(x)].

Here we see that Poss(x) is the largest membership grade of x in any subset of IX that can be a potential solution of V∗. In this respect Poss(x) can be seen as an upper bound or maximal support for the truth of the statement “x is a solution of V.” This measure is exactly analogous to the possibility measure used with possibilistic variables.

It can be shown that the following are the possibility distributions associated with the four canonical types of veristic statements:

V is ν A ⇒ Poss(x) = 1

V is ν(n) A ⇒ Poss(x) = 1 − A(x)

V is ν(c) A ⇒ Poss(x) = A(x)

V is ν(c, n) A ⇒ Poss(x) = 1 − A(x)

The “anding” and “oring” of statements of the form V is ν Aj results in a possibility distribution such that Poss(x) = 1. In the case of statements of the form V is ν(n) Aj, their “anding” results in a possibility distribution such that Poss(x) = 1 − Maxj[Aj(x)], while for their “oring” we get Poss(x) = 1 − Minj[Aj(x)].

It is also worth noting that in the case of the canonical closed statement V is ν(c) A, both verity and possibility attain the same value, A(x). Similarly, for the canonical closed statement V is ν(c, n) A, both verity and possibility attain the same value, 1 − A(x). This fact is not surprising in that these types of statements imply that the solution set V∗ is exactly known.
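These closed forms for the possibility distribution can likewise be checked by brute force on an illustrative discretized IX (the grid and the particular A are our assumptions):

```python
from itertools import product

# Sketch: possibility distribution Poss(x) = max over F of [F(x) ∧ W(F)],
# checked against two of the text's closed-form results.
X_SIZE = 2
GRID = [0.0, 0.5, 1.0]
SUBSETS = list(product(GRID, repeat=X_SIZE))
A = (1.0, 0.5)

def poss(w, i):
    """Possibility of element i given 'V* is W'."""
    return max(min(f[i], w[f]) for f in SUBSETS)

# Closed affirmative 'V is ν(c) A': the solution set is exactly A.
W_closed = {f: (1.0 if f == A else 0.0) for f in SUBSETS}
print(poss(W_closed, 0), poss(W_closed, 1))   # A(x): 1.0 0.5

# Open negative 'V is ν(n) A': W(F) = 1 iff F is contained in complement(A).
W_neg = {f: (1.0 if all(x <= 1.0 - a for x, a in zip(f, A)) else 0.0)
         for f in SUBSETS}
print(poss(W_neg, 0), poss(W_neg, 1))         # 1 - A(x): 0.0 0.5
```

Both runs reproduce the closed forms Poss(x) = A(x) and Poss(x) = 1 − A(x) listed above.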

5 Basic Querying of Relational Databases Containing Veristic Variables

A relational database provides a structured representation of knowledge in the form of a collection of relations. A relation in the database can be viewed as a table in which each column corresponds to an attribute and each row corresponds to a record or object. The intersection of a row and a column is a variable indicating the value of the row object for the column attribute. Normally one associates with each attribute a set called its universe, constituting the allowable values for the attribute. One desirable characteristic of good database design is to try to make the tables in a database have a very natural semantics: tables in which each of the records, i.e. the rows, corresponds to a real entity are in this spirit. Tables such as employees, customers and parts are examples of these types of real entity objects. Another semantically appealing type of table are those used to connect entity objects; these can be seen as linking tables. Another class of tables which have a semantic reality are those which define some concept. For example, a table with just one attribute, states, can be used to define a concept such as Northeast. We can call these concept definition tables.

One often imposed requirement in the design of databases is the satisfaction of the first normal form. This requires that each variable in a database is allowed to take one and only one element from its domain. The satisfaction of this requirement often leads to an increase in the number of tables needed to represent our knowledge. In addition, its satisfaction often requires us to compromise other desired features and sometimes to sacrifice semantic considerations. For example, if we have in a database one table corresponding to the entity employee, and if we want to include information about the names of an employee’s children, this must be stored in a different table.

One reason for the imposition of the requirement of satisfying the first normal form is the lack of a mechanism for querying databases in which attribute values can be multi-valued rather than single valued. The introduction of the idea of a veristic variable provides a facility for querying these types of multiple-solution variables.

Here we shall begin to look at the issue of querying databases with veristic variables. We note the previous related work on this issue (Dubois and Prade 1988a; Vandenberghe, Van Schooten, De Caluwe and Kerre 1989). Assume V is a variable corresponding to an attribute of a relational database which can have multiple solutions, such as the children of John. In this case V is a veristic variable. We shall let X indicate the domain of V. Typically the information about this variable is expressed in terms of a canonical veristic statement of the form V is ν(.,.) A: John’s children are Mary, Bill and Tom. In order to use this information we need to be able to answer questions about this variable based on this information: is Tom a child of John? Ideally we desire a system in which the information is expressed in terms of a veristic variable, V is ν(.,.) A, and answers are obtained in terms of A. However, in order to develop the formalism necessary to construct a question-answering system we need to draw upon the possibility-based theory of approximate reasoning. This necessitates that we introduce the associated possibilistic variable V∗. Thus we shall assume that our information about V can be expressed as V∗ is W where W ⊆ IX. Since we have already indicated a means for translating statements involving V into those involving V∗, this poses no problem.

In the following, by a question we shall mean one in which we try to match the value of some attribute to some specified value. Typically, for any object, the answer to this question is some measure of truth that the attribute value of the object satisfies the query value.

In order to provide the machinery for answering questions, we need to introduce some tools from the theory of approximate reasoning. Assume U is a possibilistic variable taking its value in the space Y. An example useful here is the variable the Age of John. Consider the situation in which we know that John is over thirty. If we ask whether John is twelve years old, the answer is clearly no. If we ask whether John is over twenty-one, the answer is clearly yes. However, it is not always the case that we can answer every relevant question using our information. Consider the question, “Is John forty years old?” or the question, “Is John over fifty?”, neither of which can be answered. In these cases the best answer is “I don’t know”. The problem here is that there exists some uncertainty in our knowledge; all we know is that John is over thirty. If we knew the exact age of John, with no uncertainty, then any relevant question could be answered.

The theory of approximate reasoning, which was developed to implement reasoning with uncertain information, has formally looked into the issue of answering questions in the face of uncertainty.

With U as our possibilistic variable, consider the situation in which we have the knowledge that U is A and want to know if U is B is true. Formally, we have data that U is A and we desire to determine the Truth[U is B/ U is A]. As we have indicated, it is often impossible to answer this question when there exists some uncertainty as to the value of U. We should note that the problem exists even if we allow the truth value to lie in the unit interval rather than being binary.

Within approximate reasoning, two bounding measures for the Truth[U is B/ U is A] have been introduced, Poss[U is B/ U is A] and Cert[U is B/ U is A] (Zadeh 1979b). Poss[U is B/ U is A] is defined as

Poss[U is B/ U is A] = Maxy[B(y)∧ A(y)]

It provides an upper bound on the Truth[U is B/ U is A]. The other measure, the certainty, is defined as

Cert[U is B/ U is A] = 1 − Maxy[(1 − B(y)) ∧ A(y)]

and it provides a lower bound on the Truth[U is B/ U is A].

If Cert[U is B/ U is A] = Poss[U is B/ U is A] = α, then the Truth[U is B/ U is A] is precisely known; its value is α. Two particular cases of this are when α = 1 and α = 0. In the first case the truth is one; this is usually given the linguistic denotation TRUE. In the second case the truth is zero; this is usually given the linguistic denotation FALSE. Another special case is when Poss[U is B/ U is A] = 1 and Cert[U is B/ U is A] = 0; in this case the truth value can be any value in the unit interval, and this situation is given the linguistic denotation UNKNOWN.
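These two measures, and the TRUE/FALSE/UNKNOWN cases, can be sketched on the Age-of-John example. The age range and the crisp stand-ins for "over thirty", "over twenty-one", etc. are illustrative assumptions:

```python
# Sketch of Poss and Cert for a possibilistic variable on a small age grid.
Y = range(0, 81)
over_30 = {y: (1.0 if y > 30 else 0.0) for y in Y}   # data: U is A

def poss(b, a):
    """Poss[U is B / U is A] = max over y of [B(y) ∧ A(y)]."""
    return max(min(b[y], a[y]) for y in Y)

def cert(b, a):
    """Cert[U is B / U is A] = 1 - max over y of [(1 - B(y)) ∧ A(y)]."""
    return 1.0 - max(min(1.0 - b[y], a[y]) for y in Y)

over_21 = {y: (1.0 if y > 21 else 0.0) for y in Y}
over_50 = {y: (1.0 if y > 50 else 0.0) for y in Y}
is_12 = {y: (1.0 if y == 12 else 0.0) for y in Y}

print(poss(over_21, over_30), cert(over_21, over_30))  # 1.0 1.0 -> TRUE
print(poss(is_12, over_30), cert(is_12, over_30))      # 0.0 0.0 -> FALSE
print(poss(over_50, over_30), cert(over_50, over_30))  # 1.0 0.0 -> UNKNOWN
```

The three prints reproduce the three situations described in the text: a question answerable yes, one answerable no, and one whose answer is "I don't know".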

We shall now use these surrogate values to provide a machinery for answering questions about veristic attributes in a database. Assume our knowledge about the veristic attribute V can be expressed as V∗ is W. Here we recall that V∗ is the associated possibilistic variable and W is a fuzzy subset of IX, X being the domain of V.


Further assume our query is expressible as V∗ is Q, where again Q is a fuzzy subset of IX. Using the measures just introduced we get

Poss[V∗ is Q/V∗ is W] = MaxF[Q(F) ∧ W(F)]

Cert[V∗ is Q/V∗ is W] = 1 − MaxF[(1 − Q(F)) ∧ W(F)]

Here we note that F ∈ IX; the Max is taken over all fuzzy subsets of X.

Let us consider the question “is x1 a solution of V?” We note that this statement can be expressed in AR as V is ν{x1}. As we have indicated, this type of statement can be translated as V∗ is Q where Q is a fuzzy subset of IX such that for each F ∈ IX,

Q(F) = Deg({x1} ⊆ F).

Here we shall use the definition of inclusion suggested by Bandler and Kohout [1980]: if A and B are two fuzzy subsets of X then Deg(A ⊆ B) = Minx[(1 − A(x)) ∨ B(x)]. Using this in our query we get

Q(F) = Deg(Ex1 ⊆ F) = Minx[(1 − Ex1(x)) ∨ F(x)]

where Ex1 indicates the set consisting solely of x1. Since 1 − Ex1(x) = 0 for x = x1 and 1 − Ex1(x) = 1 for x ≠ x1, then

Q(F) = Minx[(1 − Ex1(x)) ∨ F(x)] = F(x1).

Using this we can now answer the question “Is x1 a solution to V given V∗ is W?” by obtaining the possibility and certainty:

Poss[V is ν{x1}/V∗ is W] = Poss[V∗ is Q/V∗ is W] = MaxF[Q(F) ∧ W(F)] = MaxF[F(x1) ∧ W(F)]

and

Cert[V is ν{x1}/V∗ is W] = 1 − MaxF[(1 − Q(F)) ∧ W(F)]

= MinF[Q(F) ∨ (1 − W(F))]

= MinF[F(x1) ∨ (1 − W(F))]

It should be noted here that

Cert[V is ν{x1}/V∗ is W] = Ver(x1)


Poss[V is ν{x1}/V∗ is W] = Poss(x1)

We further note that if V∗ is W is generated by one of the atomic types of statements V is ν(.,.) A, then we can directly provide these possibility and certainty values from A:

DATA                Cert[V is ν{x}/Data]    Poss[V is ν{x}/Data]

V is ν A            A(x)                    1
V is ν(c) A         A(x)                    A(x)
V is ν(n) A         0                       1 − A(x)
V is ν(n, c) A      1 − A(x)                1 − A(x)

It is interesting to note that in the cases in which our data is of the closed form, V is ν(c) A or V is ν(c, n) A, the truth of the question can be precisely answered, since the possibility and certainty have the same value. In particular, if we have V is ν(c) A then the truth of the question of x being a solution of V is A(x), while in the case when V is ν(c, n) A the truth is 1 − A(x).
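The table can be verified by brute force over a small discretized IX. The two-element domain, the grade grid, and the particular A are illustrative assumptions:

```python
from itertools import product

# Sketch: brute-force check of the Cert/Poss table for the query
# "is x a solution of V" (so Q(F) = F(x)).
GRID = [0.0, 0.5, 1.0]
SUBSETS = list(product(GRID, repeat=2))
A = (1.0, 0.5)
COMP_A = tuple(1.0 - a for a in A)

def subset(g, h):
    """Zadeh containment for grade tuples."""
    return all(gx <= hx for gx, hx in zip(g, h))

def poss_cert(w, i):
    """(Poss, Cert) that element i is a solution, given V* is W."""
    p = max(min(f[i], w[f]) for f in SUBSETS)
    c = min(max(f[i], 1.0 - w[f]) for f in SUBSETS)
    return p, c

W_open = {f: float(subset(A, f)) for f in SUBSETS}        # V is ν A
W_closed = {f: float(f == A) for f in SUBSETS}            # V is ν(c) A
W_neg = {f: float(subset(f, COMP_A)) for f in SUBSETS}    # V is ν(n) A
W_neg_closed = {f: float(f == COMP_A) for f in SUBSETS}   # V is ν(n,c) A

# Element 1 has A(x) = 0.5; compare each row with the table.
for name, w in [("v", W_open), ("v(c)", W_closed),
                ("v(n)", W_neg), ("v(n,c)", W_neg_closed)]:
    print(name, poss_cert(w, 1))
```

Each printed pair (Poss, Cert) matches the corresponding row of the table evaluated at A(x) = 0.5.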

The preceding observation inspires us to consider a more general result. Assume we have a query about a veristic variable V in a database that can be expressed as V∗ is Q. If our information in the database is V∗ is W, we indicated that

Poss[V∗ is Q/V∗ is W] = MaxF[Q(F) ∧ W(F)]

Cert[V∗ is Q/V∗ is W] = 1 − MaxF[(1 − Q(F)) ∧ W(F)]

which represent the upper and lower bounds on Truth[V∗ is Q/V∗ is W]. Consider now the case when the known data is an atomic statement of the closed affirmative type, V∗ is W ⇔ V is ν(c) A. As we indicated, in this case the set W is such that W(A) = 1 and W(F) = 0 for all F ≠ A. From this we see MaxF[Q(F) ∧ W(F)] = Q(A). We also see that 1 − MaxF[(1 − Q(F)) ∧ W(F)] = 1 − (1 − Q(A)) = Q(A). Thus the possibility of V∗ is Q given V∗ is W equals its certainty. From this we get

Truth[V∗ is Q/ V i s ν(c) A] = Q(A),

the membership grade of A in the query.

Consider now the case when the known data is an atomic statement of the closed negative type, V∗ is W ⇔ V is ν(n, c) A. In this case we indicated that W(Ā) = 1 and W(F) = 0 for all F ≠ Ā. From this we get that


MaxF[Q(F) ∧ W(F)] = Q(Ā)

1 − MaxF[(1 − Q(F)) ∧ W(F)] = Q(Ā)

This of course implies that Poss[V∗ is Q/V is ν(n, c) A] = Cert[V∗ is Q/V is ν(n, c) A] and hence

Truth[V∗ is Q/V is ν(n, c) A] = Q(Ā)

Thus we see that in the case of closed atomic veristic statements, the truth value of any question expressible as V∗ is Q can be uniquely obtained.

This result is very useful, especially in light of the so-called “closed world assumption” (Reiter 1978) often imposed in databases. Under this assumption, a statement such as John’s children are Mary and Tom would be assumed to be a closed statement, that is, that John’s only children are Mary and Tom.

6 Compound Querying

Let us now look at another question that we can ask about a veristic variable assuming the knowledge V∗ is W: “Are x1 and x2 solutions of V?” We can express this query as the statement V is ν A where A = {x1, x2}, so our question becomes that of finding the truth of V is ν A given V∗ is W. The statement V is ν A can be represented as V∗ is Q where Q is a fuzzy subset of IX in which Q(F) = Deg(A ⊆ F) = Minx[(1 − A(x)) ∨ F(x)]. In this case

Poss[V∗ is Q/V∗ is W] = MaxF[Q(F) ∧ W(F)] = MaxF[Minx[(1 − A(x)) ∨ F(x)] ∧ W(F)]

Since A = {x1, x2} then

Poss[V∗ is Q/V∗ is W] = MaxF[F(x1) ∧ F(x2) ∧W(F)].

For the certainty we get

Cert[V∗ is Q/V∗ is W] = 1 − MaxF[(1 − Q(F)) ∧ W(F)] = MinF[(F(x1) ∧ F(x2)) ∨ (1 − W(F))]

With A(x1) = A(x2) = 1 and all other membership grades equal to zero, we get Q(F) = F(x1) ∧ F(x2).

In the preceding we represented the question “Are x1 and x2 solutions of V?” as V is ν A where A = {x1, x2} and then translated this into V∗ is Q where Q(F) = Deg(A ⊆ F) = Minx[(1 − A(x)) ∨ F(x)]. There exists an alternative and ultimately more useful way of getting the same result.


We indicated in the preceding that the query “is x1 a solution of V?” can be expressed as V∗ is Q1 where Q1(F) = Deg(Ex1 ⊆ F) = Minx[(1 − Ex1(x)) ∨ F(x)] = F(x1). Similarly, the query whether x2 is a solution of V can be expressed as V∗ is Q2 where Q2(F) = F(x2). Using this we can form the query as to whether x1 and x2 are solutions of V as V∗ is Q where

V∗ is Q = V∗ is Q1 and V∗ is Q2 = V∗ is Q1 ∩ Q2.

In this case we also get Q(F) = Min[Q1(F), Q2(F)] = F(x1) ∧ F(x2), which leads to the same values for the possibility and certainty.

More generally, if we ask the question “are x1 and x2 and . . . and xm solutions of V?” then our question can be expressed as

V∗ is Q = V∗ is Q1 and V∗ is Q2 and . . . and V∗ is Qm

= V∗ is Q1 ∩ Q2 ∩ . . . ∩ Qm.

Since Qj(F) = F(xj), then Q(F) = Minx∈B[F(x)] where B = {x1, . . . , xm}. From this we get that

Poss[all x in B are solutions to V/V∗ is W] = MaxF[Minx∈B[F(x)] ∧ W(F)]

Cert[all x in B are solutions to V/V∗ is W] = MinF[Minx∈B[F(x)] ∨ (1 − W(F))]

In the case when V∗ is W ⇔ V is ν(c) A, then W(A) = 1 and W(F) = 0 for F ≠ A, hence

Poss[all x in B are solutions to V/V is ν(c) A] = Minx∈B[A(x)]

Cert[all x in B are solutions to V/V is ν(c) A] = Minx∈B[A(x)]

As expected these are the same, indicating that

Truth[all x in B are solutions to V/ V is ν(c) A] = Minx∈B[A(x)]

In the case when V∗ is W ⇔ V is ν(c, n) A, we also get a unique truth value, equal to

Truth[all x in B are solutions to V/V is ν(c, n) A] = Minx∈B[1 − A(x)]

If V∗ is W ⇔ V is ν A, then W(F) = 1 if A ⊆ F and W(F) = 0 if A ⊈ F, where A ⊆ F if A(x) ≤ F(x) for all x. Since A ⊆ X, then

Poss[all x in B are solutions to V/V is ν A] = 1

To calculate the certainty we need to evaluate MinF[Minx∈B[F(x)] ∨ (1 − W(F))]. Since for any F such that A ⊈ F we get W(F) = 0, and for those in which A ⊆ F we get W(F) = 1, then


Cert[all x in B are solutions to V/V is ν A] = MinA⊆F[Minx∈B[F(x)]] = Minx∈B[A(x)]

Thus the certainty is the minimal membership grade in A of any of the question elements.

In the case when our database has the value V is ν(n) A, then W(F) = 1 if F ⊆ Ā and W(F) = 0 if F ⊈ Ā. Using this we get

Poss[all x in B/V is ν(n) A] = MaxF[Minx∈B[F(x)] ∧ W(F)] = MaxF⊆Ā[Minx∈B[F(x)]]

This maximum occurs when F = Ā, hence Poss[all x in B/V is ν(n) A] = Minx∈B[1 − A(x)]. In this case Cert[all x in B/V is ν(n) A] = MinF⊆Ā[Minx∈B[F(x)]] = 0. Thus here the certainty is zero, but the possibility is the minimal membership grade of any x in B in Ā.
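The "all x in B" query can be sketched by brute force for the open affirmative and open negative cases. The grid, A, and B below are illustrative assumptions:

```python
from itertools import product

# Sketch of the compound query "are all x in B solutions of V?",
# with Q(F) = min over x in B of F(x), over a coarse discretization of IX.
GRID = [0.0, 0.5, 1.0]
SUBSETS = list(product(GRID, repeat=3))
A = (1.0, 0.5, 0.0)
B = [0, 1]                                    # indices of the query elements

def q(f):
    return min(f[i] for i in B)

def poss_cert(w):
    p = max(min(q(f), w[f]) for f in SUBSETS)
    c = min(max(q(f), 1.0 - w[f]) for f in SUBSETS)
    return p, c

# Open affirmative data V is ν A: W(F) = 1 iff A is contained in F.
W_open = {f: float(all(a <= x for a, x in zip(A, f))) for f in SUBSETS}
print(poss_cert(W_open))      # (1.0, 0.5): Poss = 1, Cert = min of A over B

# Open negative data V is ν(n) A: W(F) = 1 iff F is in the complement of A.
W_neg = {f: float(all(x <= 1.0 - a for a, x in zip(A, f))) for f in SUBSETS}
print(poss_cert(W_neg))       # (0.0, 0.0): Poss = min of (1 - A) over B
```

Both pairs agree with the closed forms derived above: Cert = Minx∈B[A(x)] in the open affirmative case, and Poss = Minx∈B[1 − A(x)] with zero certainty in the open negative case.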

We now turn to another, closely related query type. Consider the question “Are x1 or x2 solutions to V?”

Again we shall assume that the value of the veristic variable V in the database is V∗ is W. We must now find an appropriate representation of this query, V∗ is Q. We first recall that the question “is xj a solution to V,” V is ν{xj}, induces the statement V∗ is Qj where Qj is a fuzzy subset of IX such that Qj(F) = Deg(Exj ⊆ F) = Minx[(1 − Exj(x)) ∨ F(x)], where Exj indicates the set consisting solely of xj. Since 1 − Exj(x) = 0 for x = xj and 1 − Exj(x) = 1 for x ≠ xj, then Qj(F) = F(xj).

Our query as to whether x1 or x2 are solutions can be expressed as

V∗ is Q1 or V∗ is Q2

Thus our query is V∗ is Q where Q(F) = Max[Q1(F), Q2(F)] = F(x1) ∨ F(x2).

Using this we get

Poss[x1 or x2 are solutions/V∗ is W] = Poss[V∗ is Q/V∗ is W]

= MaxF[Q(F) ∧W(F)]

= MaxF[(F(x1) ∨ F(x2)) ∧W(F)]

and

Cert[x1 or x2 are solutions/V∗ is W] = 1 − MaxF[(1 − Q(F)) ∧ W(F)]

= MinF[(F(x1) ∨ F(x2)) ∨ (1 − W(F))]

The generalization of this to any crisp subset B of X is straightforward, thus

Poss[∃x ∈ B that is a solution of V/V∗ is W] = MaxF[Maxx∈B[F(x)] ∧ W(F)]


Cert[∃x ∈ B that is a solution of V/V∗ is W] = MinF[Maxx∈B[F(x)] ∨ (1 − W(F))]

We consider now the answer in the case of special forms for V∗ is W. First we consider the case where V∗ is W ⇔ V is ν A. Here W(F) = 1 if A ⊆ F and W(F) = 0 otherwise. Again, since A ⊆ X, we get

Poss[∃x ∈ B that is a solution of V/V is ν A] = 1

The certainty in this case can easily be shown to be

Cert[∃x ∈ B that is a solution of V/V is ν A] = Maxx∈B[A(x)]

In the case where V∗ is W ⇔ V is ν(c) A, where W(A) = 1 and W(F) = 0 for all F ≠ A, we get

Poss[∃x ∈ B that is a solution of V/V is ν(c) A] = Cert[∃x ∈ B that is a solution of V/V is ν(c) A] = Truth[∃x ∈ B that is a solution of V/V is ν(c) A] = Maxx∈B[A(x)]

When the knowledge is V∗ is W ⇔ V is ν(n) A, W(F) = 1 if F ⊆ Ā and W(F) = 0 if F ⊈ Ā. Then

Poss[∃x ∈ B that is a solution of V/V is ν(n) A] = Maxx∈B[1 − A(x)]

and

Cert[∃x ∈ B that is a solution of V/V is ν(n) A] = MinF⊆Ā[Maxx∈B[F(x)]] = 0

since Ø ⊆ Ā.

When V∗ is W ⇔ V is ν(n, c) A, we can show that both the possibility and certainty equal Maxx∈B[1 − A(x)].

An interesting special case is when we have no information about the value of V.

In this case V∗ is W where W(F) = 1 for all F. Under this circumstance

Poss[∃x ∈ B that is a solution of V/V∗ is W] = MaxF[Maxx∈B[F(x)] ∧ W(F)] = 1, since W(X) = 1.

Cert[∃x ∈ B that is a solution of V/V∗ is W] = MinF[Maxx∈B[F(x)] ∨ (1 − W(F))] = 0, since W(Ø) = 1.

Poss[all x in B are solutions of V/V∗ is W] = MaxF[Minx∈B[F(x)] ∧ W(F)] = 1

Cert[all x in B are solutions of V/V∗ is W] = 0

Actually this case corresponds to the situation where V∗ is W ⇔ V is ν A with A = Ø. Since for this type of information W(F) = 1 for all F such that A ⊆ F, and with A = Ø we have A ⊆ F for every F, it follows that W(F) = 1 for all F.

Another interesting case is when we have V is ν(c) Ø. Note this is different from the preceding. In this case we are saying that V has no solutions, while in the preceding we are saying that we don’t know any solutions to V.
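The existential query, and the total-ignorance case just described, can be sketched the same way (grid, A, and B are illustrative assumptions):

```python
from itertools import product

# Sketch: existential query "is some x in B a solution of V?", with
# Q(F) = max over x in B of F(x), over a coarse discretization of IX.
GRID = [0.0, 0.5, 1.0]
SUBSETS = list(product(GRID, repeat=3))
A = (1.0, 0.5, 0.0)
B = [1, 2]                                   # indices of the query elements

def q(f):
    return max(f[i] for i in B)

def poss_cert(w):
    p = max(min(q(f), w[f]) for f in SUBSETS)
    c = min(max(q(f), 1.0 - w[f]) for f in SUBSETS)
    return p, c

# Open affirmative data V is ν A: W(F) = 1 iff A is contained in F.
W_open = {f: float(all(a <= x for a, x in zip(A, f))) for f in SUBSETS}
print(poss_cert(W_open))       # (1.0, 0.5): Cert = max of A over B

# Total ignorance: W(F) = 1 for every F.
W_unknown = {f: 1.0 for f in SUBSETS}
print(poss_cert(W_unknown))    # (1.0, 0.0): the UNKNOWN situation
```

The second pair, Poss = 1 and Cert = 0, is exactly the UNKNOWN situation the no-information case produces.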

7 Conclusion

Veristic variables are variables that can assume multiple values, such as the favorite songs of Didier, the books owned by Henri, or John’s hobbies. Different types of statements providing information about veristic variables were described. A methodology, based upon the theory of approximate reasoning, was presented for representing and manipulating veristic information. We then turned to the role of these veristic variables in databases. We described methods for representing and evaluating queries involving veristic variables.

References

39. T. Andreasen, H. Christiansen and H. L. Larsen, Flexible Query Answering Systems, Kluwer Academic Publishers: Norwell, MA, 1997.

40. G. Antoniou and F. van Harmelen, A Semantic Web Primer, The MIT Press: Cambridge, MA, 2004.

41. W. Bandler and L. Kohout, “Fuzzy power sets and fuzzy implication operators,” Fuzzy Sets and Systems 4, 13–30, 1980.

42. T. Berners-Lee, J. Hendler and O. Lassila, “The semantic web,” Scientific American, May 2001.

43. P. Bosc and H. Prade, “An introduction to the fuzzy set and possibility-based treatment of flexible queries and uncertain or imprecise databases,” in Uncertainty Management in Information Systems, edited by A. Motro and P. Smets, Kluwer Academic Publishers: Norwell, MA, 285–326, 1997.

44. D. Dubois and H. Prade, “Incomplete conjunctive information,” Computers and Mathematics with Applications 15, 797–810, 1988a.

45. D. Dubois and H. Prade, Possibility Theory: An Approach to Computerized Processing of Uncertainty, Plenum Press: New York, 1988b.

46. D. Dubois and H. Prade, “Fuzzy sets in approximate reasoning Part I: Inference with possibility distributions,” Fuzzy Sets and Systems 40, 143–202, 1991.

47. V. Novak, The Alternative Mathematical Model of Linguistic Semantics and Pragmatics, Plenum Press: New York, 1992.

48. D. Fensel, J. Hendler, H. Lieberman and W. Wahlster, Spinning the Semantic Web, MIT Press: Cambridge, MA, 2003.

49. T. B. Passin, Explorer’s Guide to the Semantic Web, Manning: Greenwich, CT, 2004.

50. F. E. Petry, Fuzzy Databases: Principles and Applications, Kluwer: Boston, 1996.

51. R. Reiter, “On closed world data bases,” in Logic and Data Bases, edited by H. Gallaire and J. Minker, Plenum: New York, 119–140, 1978.

52. E. Sanchez, Fuzzy Logic and the Semantic Web, Elsevier: Amsterdam, 2006.

53. R. Vandenberghe, A. Van Schooten, R. De Caluwe and E. E. Kerre, “Some practical aspects of fuzzy database techniques: An example,” Information Sciences 14, 465–472, 1989.

54. R. R. Yager, “Some questions related to linguistic variables,” Busefal 10, 54–65, 1982.

55. R. R. Yager, “On different classes of linguistic variables defined via fuzzy subsets,” Kybernetes 13, 103–110, 1984.

56. R. R. Yager, “Set based representations of conjunctive and disjunctive knowledge,” Information Sciences 41, 1–22, 1987a.

57. R. R. Yager, “Toward a theory of conjunctive variables,” Int. J. of General Systems 13, 203–227, 1987b.

58. R. R. Yager, “Reasoning with conjunctive knowledge,” Fuzzy Sets and Systems 28, 69–83, 1988.

59. R. R. Yager, “Veristic variables,” IEEE Transactions on Systems, Man and Cybernetics Part B: Cybernetics 30, 71–84, 2000.

60. R. R. Yager, “Querying databases containing multivalued attributes using veristic variables,” Fuzzy Sets and Systems 129, 163–185, 2002.

61. R. R. Yager and D. P. Filev, Essentials of Fuzzy Modeling and Control, John Wiley: New York, 1994.

62. L. A. Zadeh, “Fuzzy sets,” Information and Control 8, 338–353, 1965.

63. L. A. Zadeh, “Fuzzy sets as a basis for a theory of possibility,” Fuzzy Sets and Systems 1, 3–28, 1978.

64. L. A. Zadeh, “A theory of approximate reasoning,” in Machine Intelligence, Vol. 9, edited by J. Hayes, D. Michie and L. I. Mikulich, Halstead Press: New York, 149–194, 1979a.

65. L. A. Zadeh, “Fuzzy sets and information granularity,” in Advances in Fuzzy Set Theory and Applications, edited by M. M. Gupta, R. K. Ragade and R. R. Yager, North-Holland: Amsterdam, 3–18, 1979b.

66. L. A. Zadeh, “Fuzzy logic = computing with words,” IEEE Transactions on Fuzzy Systems 4, 103–111, 1996.

67. L. A. Zadeh, “Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic,” Fuzzy Sets and Systems 90, 111–127, 1997.

68. L. A. Zadeh, “Toward a generalized theory of uncertainty (GTU) – An outline,” Information Sciences 172, 1–40, 2005.

69. M. Zemankova and A. Kandel, Fuzzy Relational Data Bases – A Key to Expert Systems, Verlag TÜV Rheinland: Cologne, 1984.


Uncertainty in Computational Perception and Cognition

Ashu M. G. Solo and Madan M. Gupta

Abstract Humans often mimic nature in the development of new machines or systems. The human brain, particularly its faculty for perception and cognition, is the most intriguing model for developing intelligent systems. Human cognitive processes have a great tolerance for imprecision or uncertainty. This is of great value in solving many engineering problems, as there are innumerable uncertainties in real-world phenomena. These uncertainties can be broadly classified as either uncertainties arising from the random behavior of physical processes or uncertainties arising from human perception and cognition processes. Statistical theory can be used to model the former, but lacks the sophistication to process the latter. The theory of fuzzy logic has proven to be very effective in processing the latter. The methodology of computing with words and the computational theory of perceptions are branches of fuzzy logic that deal with the manipulation of words that act as labels for perceptions expressed in natural language propositions. New computing methods based on fuzzy logic can lead to greater adaptability, tractability, robustness, lower cost solutions, and better rapport with reality in the development of intelligent systems.

1 Introduction

For a long time, engineers and scientists have learned from nature and tried to mimic some of the capabilities observed in humans and animals in electrical and mechanical machines. The Wright brothers started their work on the first airplane by studying the flight of birds. Most scientists of the time thought that it was the

Ashu M. G. Solo
Maverick Technologies America Inc., Suite 808, 1220 North Market Street, Wilmington, Delaware, U.S.A. 19801
e-mail: [email protected]

Madan M. Gupta
Intelligent Systems Research Laboratory, College of Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada S7N 5A9
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


flapping of wings that was the principal component of flying. However, the Wright brothers realized that wings were required to increase buoyancy in air.

In biomedical engineering, the principles of natural science and engineering are applied to the benefit of the health sciences. The opposite approach, reverse biological engineering, is used to apply biological principles to the solution of engineering and scientific problems. In particular, engineers and scientists use this reverse engineering approach on humans and animals in developing intelligent systems.

The principal attributes of a human being can be classified in three categories (3 H’s): hands, head, and heart. The hands category refers to the physical attributes of humans. These physical attributes have been somewhat mimicked and somewhat improved on to surpass the restrictive physical limitations of humans through such mighty machines as the tractor, assembly line, and aircraft. The head category refers to the perception and cognition abilities of the brain. The restrictive reasoning limitations of humans have been surpassed through the ongoing development of microprocessors. However, the challenge of creating an intelligent system is still in its incipient stages. Finally, the heart category refers to emotions. Emotions have not yet been mimicked in machines. It is debatable whether or not they can be replicated and whether or not replicating them is even beneficial if they could be replicated.

One of the most exciting engineering endeavors involves the effort to mimic human intelligence. Intelligence implies the ability to comprehend, reason, memorize, learn, adapt, and create. It is often said that everybody makes mistakes, but an intelligent person learns from his mistakes and avoids repeating them. This fact of life emphasizes the importance of comprehension, reasoning, learning, and the ability to improve one’s performance autonomously in the definition of intelligence.

There are essentially two computational systems: carbon-based organic brains, which have existed in humans and animals since their inception, and silicon-based computers, which have rapidly evolved over the latter half of the twentieth century and beyond. Technological advances in recent decades have made it possible to develop computers that are extremely fast and efficient for numerical computations. However, these computers lack the abilities of humans and animals in processing cognitive information acquired by natural sensors. For example, the human brain routinely performs tasks like recognizing a face in an unfamiliar crowd in 100–200 ms, whereas a computer can take days to perform a task of lesser complexity. While the information perceived through natural sensors in humans is not numerical, the brain can process such cognitive information efficiently and cause the human to act on it accordingly. Modern-day computers fail miserably in processing such cognitive information.

This leads engineers to wonder if some of the functions and attributes of the human sensory system, cognitive processor, and motor neurons can be emulated in an intelligent system. For such an emulation process, it is necessary to understand the biological and physiological functions of the brain. Hardware can be developed to model aspects of neurons, the principal element of the brain. Similarly, new theories and methodologies can be developed to model the human thinking process.

Page 256: Forging New Frontiers: Fuzzy Pioneers I

Uncertainty in Computational Perception and Cognition 253

Soft computing refers to a group of methodologies used in the design of intelligent systems. Each of these soft computing methodologies can be combined with other soft computing methodologies to utilize the benefits of each. Soft computing is composed of the constituent methodologies of fuzzy logic, neuro-computing, evolutionary computing, chaotic computing, probabilistic computing, and parts of machine learning theory. Fuzzy logic includes fuzzy mathematics, linguistic variables, fuzzy rule bases, computing with words, and the computational theory of perceptions. Neuro-computing includes static and dynamic neural networks. Evolutionary computing includes evolutionary algorithms and genetic algorithms. Chaotic computing includes fractal geometry and chaos theory. Probabilistic computing includes Bayesian networks and decision analysis. Some of these soft computing methodologies can be combined to create hybrid soft computing methodologies such as neuro-fuzzy systems, fuzzy neural networks, fuzzy genetic algorithms, neuro-genetic algorithms, fuzzy neuro-genetic algorithms, and fuzzy chaos. The synergy of different soft computing methodologies has been used in the creation of intelligent systems with greater adaptability, tractability, and robustness, a lower-cost solution, and better rapport with reality.

Many advances have been made in mimicking human cognitive abilities. These advances were mostly inspired by certain biological aspects of the human brain. One of the intriguing aspects of human perception and cognition is its tolerance for imprecision and uncertainty [1, 2, 3, 4, 5, 6, 7], which characterize most real-world phenomena.

2 Certainty and Precision

The excess of precision and certainty in engineering and scientific research and development often provides unrealizable solutions. Certainty and precision have too often become an absolute standard in design, decision-making, and control problems. This tradition can be seen in the following statement [8] by Lord Kelvin in 1883:

In physical science, the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind: it may be the beginning of knowledge but you have scarcely, in your thoughts, advanced to the state of science, whatever the matter may be.

One of the fundamental aims in science and engineering has been to move from perceptions to measurements in observations, analysis, and decision making.

Through the methodology of precise measurements, engineers and scientists have had many remarkable accomplishments. These include putting people on the moon and returning them safely to Earth, sending spacecraft to the far reaches of the solar system, sending rovers to explore the surface of Mars, exploring the ocean depths, designing computers that can perform billions of computations per second, developing the nuclear bomb, mapping the human genome, and constructing a scanning tunneling microscope that can move individual atoms. However, the path of precision, as manifested in the theories of determinism and stochasticism, has often caused engineers to be ineffectual and powerless as well as to lose scientific creativity.

3 Uncertainty and Imprecision in Perception and Cognition

The attribute of certainty or precision does not exist in human perception and cognition. Alongside many startling achievements using the methodology of precise measurements, there have been many abysmal failures, which include modeling the behavior of economic, political, social, physical, and biological systems. Engineers have been unable to develop technology that can decipher sloppy handwriting, recognize oral speech as well as a human can, translate between languages as well as a human interpreter, drive a car in heavy traffic, walk with the agility of a human or animal, replace the combat infantry soldier, determine the veracity of a statement by a human subject with an acceptable degree of accuracy, replace judges and juries, summarize a complicated document, or explain poetry or song lyrics.

Underlying these failures is the inability to manipulate imprecise perceptions instead of precise measurements in computing methodologies. Albert Einstein wrote, “So far as the laws of mathematics refer to reality, they are not certain. And so far as they are certain, they do not refer to reality” [9].

There are various types of uncertainty. However, they can be classified under two broad categories [1]: type one uncertainty and type two uncertainty.

3.1 Type One Uncertainty

Type one uncertainty deals with information that arises from the random behavior of physical systems. The pervasiveness of this type of uncertainty can be witnessed in random vibrations of a machine, random fluctuations of electrons in a magnetic field, diffusion of gases in a thermal field, random electrical activities of cardiac muscles, uncertain fluctuations in the weather pattern, and turbulent blood flow through a damaged cardiac valve. Type one uncertainty has been studied for centuries. Complex statistical mathematics has evolved for the characterization and analysis of such random phenomena.

3.2 Type Two Uncertainty

Type two uncertainty deals with information or phenomena that arise from human perception and cognition processes or from cognitive information in general. This subject has received relatively little attention. Perception and cognition through biological sensors (eyes, ears, nose, etc.), perception of pain, and other similar biological events throughout our nervous system and neural networks deserve special attention. The perception phenomena associated with these processes are characterized by many great uncertainties and cannot be described by conventional statistical theory. A person can linguistically express perceptions experienced through the senses, but these perceptions cannot be described using conventional statistical theory.

Type two uncertainty and the associated cognitive information involve the activities of neural networks. It may seem strange that such familiar notions have only recently become the focus of intense research. However, it is the relative unfamiliarity of these notions and their possible technological applications in intelligent systems that have led engineers and scientists to conduct research in the field of type two uncertainty and its associated cognitive information.

4 Human Perception and Cognition

The development of the human cognitive process and environmental perception starts taking shape with the development of imaginative ability in a baby’s brain. A baby in a cradle can recognize a human face long before it is conscious of any visual physical attributes of humans or the environment. Perception and cognition are not precise or certain. Imprecision and uncertainty play an important role in thinking and reasoning throughout a person’s life.

For example, when a person is driving a car, his natural sensors (in this case, the eyes and ears) continuously acquire information and relay it through sensory neurons to the cognitive processor (brain), which perceives environmental conditions and stimuli. The cognitive processor synthesizes the control signals and transmits them through the motor neurons to the person’s hands and feet, which activate steering, accelerating, and braking mechanisms in the car.

The human brain can instantly perceive and react to a dangerous environmental stimulus, but it might take several seconds to perform a mathematical calculation. Also, the brain has an incredible ability to manipulate perceptions of distance, size, weight, force, speed, temperature, color, shape, appearance, and innumerable other environmental attributes.

The human brain is an information processing system and neurocontroller that takes full advantage of parallel distributed hardware, handling thousands of actuators (muscle fibers) in parallel, functioning effectively in the presence of noise and nonlinearity, and capable of optimizing over a long-range planning horizon. It possesses robust attributes of distributed sensors and control mechanisms.

The perception process acquires information from the environment through the natural sensory mechanisms of sight, hearing, touch, smell, and taste. This information is integrated and interpreted through the cognitive processor called the brain. Then the cognitive process, by means of a complex network of neurons distributed in the central nervous system, goes through learning, recollecting, and reasoning, resulting in appropriate muscular control.


Silicon-based computers can outperform carbon-based computers like the human brain in numerical computations, but perform poorly in tasks such as image processing and voice recognition. It is widely believed that this is partly because of the massive parallelism and tolerance for imprecision and uncertainty embedded in biological neural processes. Therefore, it can be very beneficial to have a greater understanding of the brain in developing intelligent systems.

5 Computational Perception and Cognition

The computational function of a microcomputer is based on Boolean or binary logic. Boolean logic is unable to model human perception and cognition processes. For many years, the conventional wisdom espoused only Boolean logic, as expressed by Willard Van Orman Quine in 1970 [10]:

Let us grant, then, that the deviant can coherently challenge our classical true-false dichotomy. But why should he want to? Reasons over the years have ranged from bad to worse. The worst one is that things are not just black and white; there are gradations. It is hard to believe that this would be seen as counting against classical negation; but irresponsible literature to this effect can be cited.

Bertrand Russell, on the other hand, saw the inherent limitations of traditional logic,as he expressed in 1923 [11]:

All traditional logic habitually assumes that precise symbols are being employed. It is therefore not applicable to this terrestrial life, but only to an imagined celestial one. The law of excluded middle [A OR not-A] is true when precise symbols are employed, but it is not true when symbols are vague, as, in fact, all symbols are.

Conventional mathematical tools, both deterministic and stochastic, are based on some absolute measures of information. However, the perception and cognition activity of the brain is based on relative grades of information acquired by the human sensory system. Human sensors acquire information in the form of relative grades rather than absolute numbers. Fresh information is acquired from the environment using natural sensors, and information (experience, a knowledge base) is stored in memory. The action of the cognitive process also appears in the form of relative grades. For example, a person driving an automobile perceives the environment in a relatively graded sense and acts accordingly. Thus, cognitive processes are based on graded information.

Conventional mathematical methods can describe objects such as circles, ellipses, and parabolas, but they are unable to describe mountains, lakes, and cloud formations. One can estimate the volume of snow, the heights of mountains, or the frequencies of vibrating musical strings, but conventional mathematics cannot be used to linguistically describe the feelings associated with human perception. The study of such formless uncertainties has led engineers and scientists to contemplate developing a morphology for this amorphous uncertainty. In the past, engineers and scientists have disdained this challenge and have chosen to avoid this gap by devising theories unrelated to perception and cognitive processes.


6 Fuzzy Logic

The theory of fuzzy logic [8, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28] is based on the notion of relative graded membership, as inspired by the processes of human perception and cognition. Lotfi A. Zadeh published his first celebrated paper on fuzzy sets [12] in 1965. Since then, fuzzy logic has proven to be a very promising tool for dealing with type two uncertainty. Stochastic theory is only effective in dealing with type one uncertainty.

Fuzzy logic can deal with information arising from computational perception and cognition that is uncertain, imprecise, vague, partially true, or without sharp boundaries. Fuzzy logic allows for the inclusion of vague human assessments in computing problems. Also, it provides an effective means for conflict resolution of multiple criteria and better assessment of options. New computing methods based on fuzzy logic can be used in the development of intelligent systems for decision making, identification, recognition, optimization, and control.

Measurements are crisp numbers, but perceptions are fuzzy numbers or fuzzy granules, which are groups of objects in which there can be partial membership and in which the transition of the membership function is gradual, not abrupt. A granule is a group of objects put together by similarity, proximity, functionality, or indistinguishability.

Fuzzy logic is extremely useful for many people involved in research and development, including engineers (electrical, mechanical, civil, chemical, aerospace, agricultural, biomedical, computer, environmental, geological, industrial, mechatronics), mathematicians, computer software developers and researchers, natural scientists (biology, chemistry, earth science, physics), medical researchers, social scientists (economics, management, political science, psychology), public policy analysts, business analysts, and jurists. Indeed, the applications of fuzzy logic, once thought to be an obscure mathematical curiosity, can be found in many engineering and scientific works. Fuzzy logic has been used in numerous applications such as facial pattern recognition, washing machines, vacuum cleaners, antiskid braking systems, transmission systems, control of subway systems and unmanned helicopters, knowledge-based systems for multiobjective optimization of power systems, weather forecasting systems, models for new product pricing or project risk assessment, medical diagnosis and treatment plans, and stock trading. This branch of mathematics has instilled new life into scientific disciplines that have been dormant for a long time.

7 Computing with Words and Computational Theory of Perceptions

Computing has traditionally been focused on the manipulation of numbers and symbols. On the other hand, the new methodology called computing with words, in its initial stages of development, is focused on the manipulation of words and propositions taken from natural language. The computational theory of perceptions is based on the methodology of computing with words [8, 23, 24, 25]. Computing with words and the computational theory of perceptions are subsets of fuzzy logic. Words act as labels for perceptions, and perceptions are expressed in natural language propositions.

The methodology of computing with words is used to translate natural language propositions into a generalized constraint language. In this generalized constraint language, a proposition is expressed as a generalized constraint X isr R, where X is the constrained variable, R is the constraining relation, and isr is a variable copula in which r is a variable that defines the way X is constrained by R. The basic types of constraints are possibilistic, probabilistic, veristic, random set, Pawlak set, fuzzy graph, and usuality.
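A generalized constraint X isr R can be represented as a simple record. The class below is purely illustrative (the field names and example values are our own, not part of the generalized constraint language itself):

```python
from dataclasses import dataclass

@dataclass
class GeneralizedConstraint:
    X: str       # constrained variable
    r: str       # copula type: possibilistic, probabilistic, veristic, ...
    R: object    # constraining relation (e.g. a fuzzy set such as "most")

# A proposition like "most students did satisfactory on the test"
# constrains a proportion possibilistically by the fuzzy set "most":
gc = GeneralizedConstraint(X="Proportion(satisfactory students)",
                           r="possibilistic", R="most")
```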

There are four principal reasons for using computing with words in an engineering or scientific problem:

1. The values of variables and/or parameters are not known with sufficient precision or certainty to justify using conventional computing methods.
2. The problem cannot be solved using conventional computing methods.
3. A concept is too difficult to define using crisp numerical criteria.
4. There is a tolerance for imprecision or uncertainty that can be exploited to achieve greater adaptability, tractability, robustness, a lower-cost solution, or better rapport with reality in the development of intelligent systems.

The computing with words methodology is much more effective and versatile for knowledge representation, perception, and cognition than classical logic systems such as predicate logic, propositional logic, and modal logic, and than artificial intelligence techniques for knowledge representation and natural language processing. The computational theory of perceptions and computing with words will be extremely important branches of fuzzy logic in the future.

8 Examples of Uncertainty Management in Computational Perception and Cognition for Linguistic Evaluations

The following examples of uncertainty management in computational perception and cognition for linguistic evaluations illustrate applications of fuzzy sets, linguistic variables, fuzzy mathematics, the computational theory of perceptions, computing with words, and fuzzy rule bases.

A database of students’ grades on a test is kept by a professor:

Database = Students [Name, Grade]

The students are assigned a linguistic grade for their test performance. These linguistic grades are fail, poor, satisfactory, good, excellent, or outstanding. A crisp set is created for the fail linguistic qualifier, and a fuzzy set is created for each of the remaining linguistic qualifiers, as shown in Fig. 1. Each of these crisp and fuzzy sets corresponds to a range of numeric grades.


Fig. 1 Fuzzy Sets for Test Grades

The fail crisp set is represented by a Z-shaped membership function that assigns a membership value of 1 to all grades below 49.5% and a membership value of 0 to all grades above 49.5%. The outstanding fuzzy set is represented by an S-shaped membership function that assigns a membership value of 0 to grades below 88%, a membership value increasing from 0 at 88% to 1 at 92%, and a membership value of 1 to grades above 92%.

The poor, satisfactory, good, and excellent fuzzy sets are represented by trapezoidal membership functions. The poor fuzzy set assigns unity membership from 49.5% to 58%, membership transitioning from 1 at 58% to 0 at 62%, and 0 membership below 49.5% and above 62%. Membership in the satisfactory fuzzy set transitions from 0 at 58% to 1 at 62%, is 1 from 62% through 68%, and transitions from 1 at 68% to 0 at 72%. Membership in the good fuzzy set transitions from 0 at 68% to 1 at 72%, is 1 from 72% through 78%, and transitions from 1 at 78% to 0 at 82%. The excellent fuzzy set assigns unity membership from 82% to 88%, membership transitioning from 0 at 78% to 1 at 82%, and membership transitioning from 1 at 88% to 0 at 92%.

The absolute values of the slopes are the same for the trailing edge of the poor fuzzy set, the leading edge of the outstanding fuzzy set, and the leading and trailing edges of the satisfactory, good, and excellent fuzzy sets. Thus, membership is not biased toward one fuzzy set over another, and if a student’s test grade has membership in two fuzzy sets, the sum of its memberships in both of these sets will be 1. For example, for a grade of 60%, there is a membership of 0.5 in the poor fuzzy set and 0.5 in the satisfactory fuzzy set. For a grade of 71%, there is a membership of 0.75 in the good fuzzy set and a membership of 0.25 in the satisfactory fuzzy set.
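These membership functions can be sketched in Python. This is a minimal illustration; the `trapezoid` helper and function names are our own, with breakpoints taken from the description of Fig. 1:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises over (a, b), is 1 on [b, c], falls over (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

# Breakpoints from the text (grades in percent).
grade_sets = {
    "fail":         lambda x: 1.0 if x < 49.5 else 0.0,        # Z-shaped crisp set
    "poor":         lambda x: trapezoid(x, 49.5, 49.5, 58, 62),
    "satisfactory": lambda x: trapezoid(x, 58, 62, 68, 72),
    "good":         lambda x: trapezoid(x, 68, 72, 78, 82),
    "excellent":    lambda x: trapezoid(x, 78, 82, 88, 92),
    "outstanding":  lambda x: trapezoid(x, 88, 92, 100, 100),  # S-shaped shoulder
}

def memberships(grade):
    """Nonzero memberships of a numeric grade in the linguistic sets."""
    return {name: mu(grade) for name, mu in grade_sets.items() if mu(grade) > 0}
```

`memberships(60)` yields 0.5 in poor and 0.5 in satisfactory, and `memberships(71)` yields 0.75 in good and 0.25 in satisfactory, matching the examples above.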

Linguistic qualifiers by themselves can be restrictive in describing fuzzy variables, so linguistic hedges can be used to supplement linguistic qualifiers through numeric transformation. For example, a very hedge can square the initial degree of membership µ in a fuzzy set:

µ_very good(Student 5) = [µ_good(Student 5)]^2

Following are some other possible linguistic hedges:


• somewhat
• slightly
• more or less
• quite
• extremely
• above
• below
• fairly
• not
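Hedges can be implemented as numeric transformations of a membership value. In the sketch below, only the very hedge (squaring) comes from the text; the exponent for somewhat (square root, Zadeh's dilation) and the complement for not are common conventions we assume for illustration:

```python
def very(mu):
    """Concentration: squaring sharpens membership (from the text)."""
    return mu ** 2

def somewhat(mu):
    """Dilation: square root relaxes membership (assumed convention)."""
    return mu ** 0.5

def not_(mu):
    """Complement (assumed convention)."""
    return 1.0 - mu

mu_good = 0.75                # hypothetical degree of membership in good
mu_very_good = very(mu_good)  # 0.5625
```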

A fuzzy rule base can be developed to decide what to do with individual students based on their grades on a test. A sample fuzzy rule base follows:

If a student receives a grade of fail on a test, then phone her parents and give her additional homework.

If a student receives a grade of poor on a test, then give her additional homework.

If a student receives a grade of satisfactory or good or excellent on a test, then do nothing.

If a student receives a grade of outstanding on a test, then give her a reward.

If a student receives a grade that has membership in two different fuzzy sets, this will potentially assign two different crisp consequents for dealing with the student. For example, if a student receives a grade of 91%, then this grade has membership in two fuzzy sets:

1. excellent with a degree of membership of 0.25
2. outstanding with a degree of membership of 0.75

The maximum (MAX) operator can be used to resolve these conflicting fuzzy antecedents to the fuzzy rule. The MAX operator in fuzzy set theory is a generalization of the MAX operator in crisp set theory. In the following equation, µ represents the degree of membership; x represents the domain value; k1, . . . , kn represent fuzzy sets in which there is membership for a domain value of x; and k represents the fuzzy set in which there is greatest membership for a domain value of x:

µ_k(x) = MAX{µ_k1(x), . . . , µ_kn(x)}

Using the MAX operator, the antecedent used in the fuzzy rules above is outstanding with a degree of membership of 0.75. This results in the student getting a reward.
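This conflict resolution can be sketched in a few lines; the dictionary values are the memberships of a 91% grade computed from the fuzzy sets of Fig. 1, and the action strings paraphrase the rule base above:

```python
# Memberships of a 91% grade in the two overlapping fuzzy sets.
memberships_91 = {"excellent": 0.25, "outstanding": 0.75}

# MAX operator: keep the antecedent with the greatest membership.
winning_set = max(memberships_91, key=memberships_91.get)

actions = {
    "excellent": "do nothing",
    "outstanding": "give her a reward",
}
action = actions[winning_set]
```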

Linguistic grades, possibly modified by linguistic hedges, can be defuzzified using the center of gravity defuzzification formula to obtain a single crisp representative value. In the following equation for center of gravity defuzzification, µ is the degree of membership, x is the domain value, A is the membership function possibly modified by linguistic hedges, and x0 is the single crisp representative value:


x0 = ∫ µ_A(x) x dx / ∫ µ_A(x) dx
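The center of gravity integrals can be approximated numerically. The sketch below (our own helper, using a simple Riemann sum) defuzzifies the good fuzzy set from Fig. 1, whose symmetric trapezoid on [68, 82] should yield a centroid of 75%:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises over (a, b), is 1 on [b, c], falls over (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

def cog_defuzzify(mu, lo, hi, n=100_000):
    """Center of gravity: x0 = integral(mu(x)*x dx) / integral(mu(x) dx),
    approximated by sampling n+1 equally spaced points on [lo, hi]."""
    num = den = 0.0
    for i in range(n + 1):
        x = lo + (hi - lo) * i / n
        m = mu(x)
        num += m * x
        den += m
    return num / den

good = lambda x: trapezoid(x, 68, 72, 78, 82)
x0 = cog_defuzzify(good, 68, 82)  # about 75.0 by symmetry
```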

An instructor can assign linguistic grades to students for a series of tests. Each of these linguistic grades can be defuzzified using the center of gravity defuzzification formula, so a crisp overall numerical average can be calculated. Then this crisp overall grade can be refuzzified for an overall linguistic grade.

A numerical average is not always useful in determining overall performance. For example, there could be two critical exams that test two very different criteria, and performance on each of these exams must be considered both individually and together in making a decision on an individual’s advancement. In such a case, a fuzzy rule base can be developed to evaluate an individual’s performance on different exams and make a decision on the individual’s advancement.

A sample fuzzy rule base is shown below that evaluates the performance of a soldier on a special forces academic exam and a special forces athletic exam, and then determines the degree of desirability of that soldier for special forces training. The same fuzzy sets shown in Fig. 1 and described above are used for the antecedents of fuzzy rules in this fuzzy rule base. The fuzzy sets shown in Fig. 2 are used for the consequents of fuzzy rules in this fuzzy rule base to describe the degree of desirability of soldiers for special forces training. The sample fuzzy rule base follows:

If a soldier receives a grade of fail or poor on the special forces academic exam or a grade of fail or poor on the special forces athletic exam, then he is undesirable for special forces training.

If a soldier receives a grade of satisfactory on the special forces academic exam or a grade of satisfactory on the special forces athletic exam, then he has low desirability for special forces training.

If a soldier receives a grade of good or excellent on the special forces academic exam and a grade of good or excellent on the special forces athletic exam, then he has medium desirability for special forces training.

If a soldier receives a grade of good or excellent on the special forces academic exam and a grade of outstanding on the special forces athletic exam, then he has high desirability for special forces training.

If a soldier receives a grade of outstanding on the special forces academic exam and a grade of good or excellent on the special forces athletic exam, then he has high desirability for special forces training.

If a soldier receives a grade of outstanding on the special forces academic exam and a grade of outstanding on the special forces athletic exam, then accept him for special forces training.
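One common way to evaluate such rules is the usual fuzzy-logic convention of min for and and max for or, which we assume here; the exam memberships below are hypothetical:

```python
AND, OR = min, max  # common fuzzy conjunction/disjunction (assumed)

# Hypothetical memberships of one soldier's two exam grades.
academic = {"good": 0.6, "excellent": 0.4, "outstanding": 0.0}
athletic = {"good": 0.0, "excellent": 0.2, "outstanding": 0.8}

# Rule: good-or-excellent academic AND outstanding athletic -> high desirability.
strength_high = AND(OR(academic["good"], academic["excellent"]),
                    athletic["outstanding"])

# Rule: good-or-excellent academic AND good-or-excellent athletic -> medium.
strength_medium = AND(OR(academic["good"], academic["excellent"]),
                      OR(athletic["good"], athletic["excellent"]))
```

Here the high-desirability rule fires with strength 0.6 and the medium rule with strength 0.2, so high desirability dominates for this soldier.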

The overall desirability for special forces training of soldiers in a platoon can be determined to evaluate the performance of a platoon leader. To calculate this, the desirability for special forces training of each individual soldier in the platoon is defuzzified. Then the crisp overall numerical average can be calculated and refuzzified for a single linguistic grade describing the overall desirability for special forces training of soldiers in the platoon.

Fig. 2 Fuzzy Sets for Desirability

The computational theory of perceptions and computing with words can be used to form propositions and queries on linguistic grades. A professor keeps the grades of N students in a database. The name and grade of student i, where i = 1, . . . , N, are indicated by Name_i and Grade_i, respectively.

Consider the following natural language query: What proportion of students received outstanding grades on the test? The answer to this query is given by the sigma count [17]:

ΣCount(outstanding.students / students) = (1/N) Σ_i µ_outstanding(Grade_i)

Consider another natural language query: What proportion of students received satisfactory or better grades on the test? The answer to this query is given by the following:

ΣCount((satisfactory.students + good.students + excellent.students + outstanding.students) / students)
= (1/N) Σ_i [µ_satisfactory(Grade_i) + µ_good(Grade_i) + µ_excellent(Grade_i) + µ_outstanding(Grade_i)]
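Both queries reduce to averaging memberships over the class. The sketch below uses a hypothetical list of grades (the membership functions follow the breakpoints of Fig. 1):

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises over (a, b), is 1 on [b, c], falls over (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

mu_satisfactory = lambda g: trapezoid(g, 58, 62, 68, 72)
mu_good         = lambda g: trapezoid(g, 68, 72, 78, 82)
mu_excellent    = lambda g: trapezoid(g, 78, 82, 88, 92)
mu_outstanding  = lambda g: trapezoid(g, 88, 92, 100, 100)

grades = [91, 60, 75, 85, 95]  # hypothetical class of N = 5 students
N = len(grades)

# Sigma count: proportion of students with outstanding grades.
prop_outstanding = sum(mu_outstanding(g) for g in grades) / N

# Proportion of students with satisfactory or better grades.
prop_sat_or_better = sum(mu_satisfactory(g) + mu_good(g)
                         + mu_excellent(g) + mu_outstanding(g)
                         for g in grades) / N
```

For this hypothetical class, the answers come out to 0.35 and 0.9, respectively.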

Consider the natural language proposition: Most students did satisfactory on the test. The membership function of most is defined as in Fig. 3. This natural language proposition can be translated into a generalized constraint language proposition:

(1/N) Σ_i µ_satisfactory(Grade_i) is most.
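The truth value of the proposition is then the membership of this proportion in most. The actual shape of most is given in Fig. 3; the linear ramp below is only an assumed stand-in for illustration:

```python
def mu_most(p):
    """Assumed membership for 'most': 0 up to a proportion of 0.5,
    rising linearly to 1 at 0.8 (a stand-in for the shape in Fig. 3)."""
    return min(1.0, max(0.0, (p - 0.5) / 0.3))

# Suppose the average satisfactory-membership over the class is 0.74.
truth = mu_most(0.74)  # about 0.8 under the assumed ramp
```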


Fig. 3 Membership Function of Most

Finally, a fuzzy rule base can be developed to decide whether to make adjustments to students’ test grades. A sample fuzzy rule base follows:

If most students receive a grade of fail on the test, then give them a makeup test and take teaching lessons.

If most students receive a grade of fail or poor or satisfactory on the test, then boost the grades so most students receive a grade of good or excellent or outstanding on the test.

If most students receive a grade of good or excellent or outstanding on the test, then hand back the tests with these grades.

9 Conclusion

Uncertainty is an inherent phenomenon in the universe and in people’s lives. To some, it may become a cause of anxiety, but to engineers and scientists it becomes a frontier full of challenges. Engineers and scientists attempt to comprehend the language of this uncertainty through mathematical tools, but these mathematical tools are still incomplete.

In the past, studies of cognitive uncertainty and cognitive information were hindered by the lack of suitable tools for modeling such information. However, fuzzy logic, neural networks, and other methods have made it possible to expand studies in this field.


Progress in information technology has significantly broadened the capabilities and applications of computers. Today’s computers are merely being used for the storage and processing of binary information. The functions of these computing tools should be reexamined in view of an increasing interest in intelligent systems for solving problems related to decision making, identification, recognition, optimization, and control [29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41].

Shannon’s definition of information was based on certain physical measurements of random activities in systems, particularly communication channels [42]. This definition of information was restricted only to a class of information arising from physical systems. To emulate cognitive functions such as learning, remembering, reasoning, and perceiving in intelligent systems, this definition of information must be generalized, and new mathematical tools and hardware must be developed that can deal with the simulation and processing of cognitive information.

References

1. M. M. Gupta, “Cognition, Perception and Uncertainty,” in Fuzzy Logic in Knowledge-Based Systems, Decision and Control, M. M. Gupta and T. Yamakawa, eds., New York: North-Holland, 1988, pp. 3–6.

2. M. M. Gupta, “On Cognitive Computing: Perspectives,” in Fuzzy Computing: Theory, Hardware, and Applications, M. M. Gupta and T. Yamakawa, eds., New York: North-Holland, 1988, pp. 7–10.

3. M. M. Gupta, “Uncertainty and Information: The Emerging Paradigms,” International Journal of Neuro and Mass-Parallel Computing and Information Systems, vol. 2, 1991, pp. 65–70.

4. M. M. Gupta, “Intelligence, Uncertainty and Information,” in Analysis and Management of Uncertainty: Theory and Applications, B. M. Ayyub, M. M. Gupta, and L. N. Kanal, eds., New York: North-Holland, 1992, pp. 3–12.

5. G. J. Klir, “Where Do We Stand on Measures of Uncertainty, Ambiguity, Fuzziness and the Like,” Fuzzy Sets and Systems, Special Issue on Measure of Uncertainty, vol. 24, no. 2, November 1987, pp. 141–160.

6. G. J. Klir, “The Many Faces of Uncertainty,” in Uncertainty Modelling and Analysis: Theoryand Applications, B. M. Ayyub and M. M. Gupta, eds., New York: North-Holland, 1994,pp. 3–19.

7. M. Black, “Vagueness: An Exercise in Logical Analysis,” Philosophy of Science, no. 4, 1937,pp. 427–455.

8. L. A. Zadeh, “From Computing with Numbers to Computing with Words—From Manipula-tion of Measurements to Manipulation of Perceptions,” IEEE Transactions on Circuits andSystems—I: Fundamental Theory and Applications, vol. 46, no. 1, 1999, pp. 105–119.

9. A. Einstein, “Geometry and Experience,” in The Principle of Relativity: A Collection of Orig-inal Papers on the Special and General Theory of Relativity, New York: Dover, 1952.

10. W. V. O. Quine, Philosophy of Logic, Englewood Cliffs, NJ: Prentice Hall, 1970.

11. B. Russell, “Vagueness,” Australian Journal of Philosophy, vol. 1, 1923.

12. L. A. Zadeh, “Fuzzy Sets,” Information and Control, vol. 8, 1965, pp. 338–353.

13. L. A. Zadeh, “A fuzzy-set-theoretic interpretation of linguistic hedges,” Journal of Cybernetics, vol. 2, 1972, pp. 4–34.

14. L. A. Zadeh, “Outline of a new approach to the analysis of complex systems and decision processes,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 3, 1973, pp. 28–44.


Uncertainty in Computational Perception and Cognition 265

15. L. A. Zadeh, “Calculus of fuzzy restrictions,” in Fuzzy Sets and Their Applications to Cognitive and Decision Processes, L. A. Zadeh, K. S. Fu, and M. Shimura, eds., New York: Academic, 1975, pp. 1–39.

16. L. A. Zadeh, “The concept of a linguistic variable and its application to approximate reasoning,” Part I: Information Science, vol. 8, pp. 199–249; Part II: Information Science, vol. 8, pp. 301–357; Part III: Information Science, vol. 9, pp. 43–80, 1975.

17. L. A. Zadeh, “PRUF—A meaning representation language for natural languages,” International Journal of Man-Machine Studies, vol. 10, 1978, pp. 395–460.

18. L. A. Zadeh, “Fuzzy sets and information granularity,” in Advances in Fuzzy Set Theory and Applications, M. M. Gupta, R. Ragade, and R. Yager, eds., Amsterdam: North-Holland, 1979, pp. 3–18.

19. L. A. Zadeh, “A theory of approximate reasoning,” Machine Intelligence, vol. 9, J. Hayes, D. Michie, and L. I. Mikulich, eds., New York: Halstead, 1979, pp. 149–194.

20. L. A. Zadeh, “Outline of a computational approach to meaning and knowledge representation based on the concept of a generalized assignment statement,” Proceedings of the International Seminar on Artificial Intelligence and Man-Machine Systems, 1986, pp. 198–211.

21. L. A. Zadeh, “Fuzzy logic, neural networks, and soft computing,” Communications of the ACM, vol. 37, no. 3, 1994, pp. 77–84.

22. L. A. Zadeh, “Fuzzy logic and the calculi of fuzzy rules and fuzzy graphs: A précis,” in Multiple Valued Logic 1, Gordon and Breach Science, 1996, pp. 1–38.

23. L. A. Zadeh, “Fuzzy Logic = Computing with Words,” IEEE Transactions on Fuzzy Systems, vol. 4, 1996, pp. 103–111.

24. L. A. Zadeh, “Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic,” Fuzzy Sets and Systems, vol. 90, 1997, pp. 111–127.

25. L. A. Zadeh, “Outline of a Computational Theory of Perceptions Based on Computing with Words,” in Soft Computing & Intelligent Systems, N. K. Sinha and M. M. Gupta, eds., New York: Academic, 2000, pp. 3–22.

26. B. Kosko, Fuzzy Thinking: The New Science of Fuzzy Logic, New York: Hyperion, 1993.

27. A. Kaufmann and M. M. Gupta, Introduction to Fuzzy Arithmetic: Theory and Applications, New York: Van Nostrand Reinhold, 1985.

28. A. Kaufmann and M. M. Gupta, Fuzzy Mathematical Models in Engineering and Management Science, Amsterdam: North-Holland, 1988.

29. R. J. Sarfi and A. M. G. Solo, “The Application of Fuzzy Logic in a Hybrid Fuzzy Knowledge-Based System for Multiobjective Optimization of Power Distribution System Operations,” Proceedings of the 2005 International Conference on Information and Knowledge Engineering (IKE’05), pp. 3–9.

30. M. M. Gupta, G. N. Saridis, and B. R. Gaines, eds., Fuzzy Automata and Decision Processes, Elsevier North-Holland, 1977.

31. M. M. Gupta and E. Sanchez, eds., Approximate Reasoning in Decision Analysis, North-Holland, 1982.

32. M. M. Gupta and E. Sanchez, eds., Fuzzy Information and Decision Processes, North-Holland, 1983.

33. M. M. Gupta, A. Kandel, W. Bandler, and J. B. Kiszka, eds., Approximate Reasoning in Expert Systems, North-Holland, 1985.

34. S. Mitra, M. M. Gupta, and W. Kraske, eds., Neural and Fuzzy Systems: The Emerging Science of Intelligent Computing, International Society for Optical Engineering (SPIE), 1994.

35. M. M. Gupta and D. H. Rao, eds., Neuro-Control Systems: Theory and Applications, Piscataway, NJ: IEEE, 1994.

36. M. M. Gupta and G. K. Knopf, eds., Neuro-Vision Systems: Principles and Applications, Piscataway, NJ: IEEE, 1994.

37. H. Li and M. M. Gupta, eds., Fuzzy Logic and Intelligent Systems, Boston: Kluwer Academic, 1995.

38. B. M. Ayyub and M. M. Gupta, eds., Uncertainty Analysis in Engineering and Sciences: Fuzzy Logic, Statistics and Neural Networks Approach, Boston: Kluwer Academic, 1997.


266 A. M. G. Solo, M. M. Gupta

39. M. M. Gupta and N. K. Sinha, eds., Intelligent Control Systems: Theory and Applications, Piscataway, NJ: IEEE, 1995.

40. N. K. Sinha and M. M. Gupta, eds., Soft Computing and Intelligent Systems, New York: Academic, 2000.

41. M. M. Gupta, L. Jin, and N. Homma, Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory, Hoboken, NJ: John Wiley & Sons, 2003, pp. 633–686.

42. F. M. Reza, An Introduction to Information Theory, New York: McGraw-Hill, 1961, p. 10.


Computational Intelligence for Geosciences and Oil Exploration

Masoud Nikravesh

Abstract In this overview paper, we highlight the role of soft computing techniques for intelligent reservoir characterization and exploration, seismic data processing and characterization, well logging, reservoir mapping, and engineering. Reservoir characterization plays a crucial role in modern reservoir management. It helps to make sound reservoir decisions and improves the asset value of oil and gas companies. It maximizes integration of multidisciplinary data and knowledge and improves the reliability of reservoir predictions. The ultimate product is a reservoir model with realistic tolerance for imprecision and uncertainty. Soft computing aims to exploit such a tolerance for solving practical problems. In reservoir characterization, these intelligent techniques can be used for uncertainty analysis, risk assessment, data fusion, and data mining, which are applicable to feature extraction from seismic attributes, well logging, reservoir mapping, and engineering. The main goal is to integrate soft data, such as geological data, with hard data, such as 3D seismic and production data, to build a reservoir and stratigraphic model. While some individual methodologies (especially neurocomputing) have gained much popularity during the past few years, the true benefit of soft computing lies in the integration of its constituent methodologies rather than their use in isolation.

1 Introduction

This paper reviews the recent geosciences applications of soft computing (SC) with special emphasis on exploration. The role of soft computing as an effective method of data fusion will be highlighted. SC is a consortium of computing methodologies (fuzzy logic (FL), neurocomputing (NC), genetic computing (GC), and probabilistic reasoning (PR), the latter including genetic algorithms (GA), chaotic systems (CS), belief networks (BN), and learning theory (LT)) which collectively provide a foundation for the conception, design, and deployment of intelligent systems. The role

Masoud Nikravesh
Berkeley Initiative in Soft Computing (BISC) Program, Computer Science Division, Department of EECS, University of California, Berkeley, CA 94720
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I, 267. © Springer 2007


model for soft computing is the human mind. Unlike conventional or hard computing, it is tolerant of imprecision, uncertainty, and partial truth. It is also tractable, robust, efficient, and inexpensive. Among the main components of soft computing, artificial neural networks, fuzzy logic, and genetic algorithms in the “exploration domain” will be examined. Specifically, the earth exploration applications of SC in various aspects will be discussed. We outline the unique roles of the three major methodologies of soft computing: neurocomputing, fuzzy logic, and evolutionary computing. We will summarize a number of relevant and documented reservoir characterization applications. We will also provide a list of recommendations for the future use of soft computing. This includes hybrids of various methodologies (e.g., neural-fuzzy or neuro-fuzzy, neural-genetic, fuzzy-genetic, and neural-fuzzy-genetic) and the latest tool of “computing with words” (CW) (Zadeh 1996). CW provides a completely new insight into computing with imprecise, qualitative, and linguistic phrases and is a potential tool for geological modeling that is based on words rather than exact numbers.

2 The Role of Soft Computing Techniques for Intelligent Reservoir Characterization and Exploration

Soft computing is bound to play a key role in the earth sciences. This is in part due to the subjective nature of the rules governing many physical phenomena in the earth sciences. The uncertainty associated with the data, the immense size of the data to be dealt with, and the diversity of the data types and their associated scales are important factors that call for unconventional mathematical tools such as soft computing. Many of these issues are addressed in recent books, Nikravesh et al. (2002) and Wong et al. (2001), recent special issues, Nikravesh et al. (2001a, 2001b) and Wong and Nikravesh (2001), and other publications such as Zadeh (1994), Zadeh and Aminzadeh (1995), and Aminzadeh and Jamshidi (1994).

Recent applications of soft computing techniques have already begun to enhance our ability to discover new reserves and to assist in improved reservoir management and production optimization. This technology has also been proven useful in production from low-permeability and fractured reservoirs such as fractured shale, fractured tight gas reservoirs, and reservoirs in deep water or below salt, which contain major portions of future oil and gas resources.

Intelligent techniques such as neural computing, fuzzy reasoning, and evolutionary computing for data analysis and interpretation are increasingly powerful tools for making breakthroughs in science and engineering by transforming data into information and information into knowledge.

In the oil and gas industry, these intelligent techniques can be used for uncertainty analysis, risk assessment, data fusion and mining, data analysis and interpretation, and knowledge discovery from diverse data such as 3-D seismic, geological data, well logs, and production data. In addition, these techniques can be a key to cost-effectively locating and producing our remaining oil and gas reserves. They can be used as tools for:


1) Lowering exploration risk
2) Reducing exploration and production cost
3) Improving recovery through more efficient production
4) Extending the life of producing wells

In what follows, we will first address data processing, fusion, and mining. Then we will discuss interpretation, pattern recognition, and intelligent data analysis.

2.1 Mining and Fusion of Data

In recent years in the oil industry, we have witnessed a massive explosion in the data volume we have to deal with. As outlined in Aminzadeh (1996), this is caused by increased sampling rates, larger offsets and longer record acquisition, multi-component surveys, 4-D seismic and, most recently, the possibility of continuous recording in “instrumented oil fields”. Thus we need efficient techniques to process such large data volumes. Automated techniques to refine the data (trace editing and filtering), select the desired event types (first-break picking), or automate interpretation (horizon tracking) are needed for large data volumes. Fuzzy logic and neural networks have proven to be effective tools for such applications. To make use of large volumes of field data and the multitude of associated data volumes (e.g., different attribute volumes or partial stack or angle gathers), effective data compression methods will be of increasing significance, both for fast data transmission, efficient processing, analysis, and visualization, and for economical data storage. Most likely, the biggest impact of advances in data compression techniques will be realized when geoscientists have the ability to fully process and analyze data in the compressed domain. This will make it possible to carry out computer-intensive processing of large volumes of data in a fraction of the time, resulting in tremendous cost reductions. Data mining is another alternative that helps identify the most information-rich part of the large volumes of data. Again, many recent reports have demonstrated that neural networks and fuzzy logic, in combination with some of the more conventional methods such as eigenvalue or principal component analysis, are very useful.

Figure 1 shows the relationship between intelligent technologies and data fusion/data mining. Tables 1 and 2 list data fusion and data mining techniques. Figure 2 and Table 3 show the reservoir data mining and reservoir data fusion concepts and techniques. Table 4 compares geostatistical and intelligent techniques. In Sections II through VIII, we will highlight some of the recent applications of these methods in various earth science disciplines.

2.2 Intelligent Interpretation and Data Analysis

For a detailed review of various applications of soft computing in intelligent interpretation, data analysis, and pattern recognition, see recent books, Nikravesh et al.


Fig. 1 Intelligent technologies, data mining, and data fusion: I, conventional interpretation; II, conventional integration; III, intelligent characterization

(2002) and Wong et al. (2001), recent special issues, Nikravesh et al. (2001a, 2001b) and Wong and Nikravesh (2001), and other publications such as Aminzadeh (1989), Aminzadeh (1991), and Aminzadeh and Jamshidi (1995).

Although seismic signal processing has advanced tremendously over the last four decades, the fundamental assumption of a “convolution model” is violated in many practical settings. Sven Treitel, quoted in Aminzadeh (1995), posed the question: what if Mother Earth refuses to convolve? Among such situations are:

Table 1 Data Mining Techniques

• Deductive Database Client

• Inductive Learning

• Clustering

• Case-based Reasoning

• Visualization

• Statistical Package

Table 2 Data Fusion Techniques

• Deterministic

– Transform based (projections, ...)
– Functional evaluation based (vector quantization, ...)
– Correlation based (pattern match, if/then productions)
– Optimization based (gradient-based, feedback, LDP, ...)

• Non-deterministic

– Hypothesis testing (classification, ...)
– Statistical estimation (maximum likelihood, ...)
– Discrimination function (linear aggregation, ...)
– Neural network (supervised learning, clustering, ...)
– Fuzzy logic (fuzzy c-means clustering, ...)

• Hybrid (genetic algorithms, Bayesian networks, ...)


Fig. 2 Reservoir data mining: geological/stratigraphic, seismic, well log, core, and test data feed rock-physical, geostatistical, and intelligent algorithms, yielding formation characters, reservoir properties, and seismic attributes

Table 3 Reservoir Data Fusion

• Rockphysical

– transform seismic data to attributes and reservoir properties
– formulate seismic/log/core data to reservoir properties

• Geostatistical

– transform seismic attributes to formation characters
– transform seismic attributes to reservoir properties
– simulate the 2D/3D distribution of seismic and log attributes

• Intelligent

– clustering anomalies in seismic/log data and attributes
– ANN layers for seismic attributes and formation characters
– supervised training model to predict unknown from existing
– hybrids such as GA and SA for complicated reservoirs

Table 4 Geostatistical vs. Intelligent

• Geostatistical

– Data assumption: a certain probability distribution
– Model: weight functions come from variogram trend, stratigraphic facies, and probability constraints
– Simulation: stochastic, not optimized

• Intelligent

– Data: automatic clustering and expert-guided segmentation
– Classification of relationship between data and targets
– Model: weight functions come from supervised training based on geological and stratigraphic information
– Simulation: optimized by GA, SA, ANN, and BN


highly heterogeneous environments, very absorptive media (such as unconsolidated sand and young sediments), fractured reservoirs, and mud volcanoes, karst, and gas chimneys. In such cases we must consider non-linear processing and interpretation methods. Neural networks, fractals, fuzzy logic, genetic algorithms, chaos, and complexity theory are among the non-linear processing and analysis techniques that have been proven effective. The highly heterogeneous earth model that geophysics attempts to quantify is an ideal place for applying these concepts. The subsurface lives in a hyper-dimensional space (its properties can be considered as additional space dimensions), but its actual response to external stimuli initiates an internal coarse-graining and self-organization that results in low-dimensional structured behavior. Fuzzy logic and other non-linear methods can describe shapes and structures generated by chaos. These techniques will push the boundaries of seismic resolution, allowing smaller-scale anomalies to be characterized.

2.3 Pattern Recognition

Due to recent advances in computer systems and technology, artificial neural networks and fuzzy logic models have been used in many pattern recognition applications, ranging from simple character recognition, interpolation, and extrapolation between specific patterns to the most sophisticated robotic applications. To recognize a pattern, one can use the standard multi-layer perceptron with a back-propagation learning algorithm, or simpler models such as self-organizing networks (Kohonen, 1997) or fuzzy c-means techniques (Bezdek et al., 1981; Jang and Gulley, 1995). Self-organizing networks and fuzzy c-means techniques can easily learn to recognize the topology, patterns, or seismic objects and their distribution in a specific set of information.

2.4 Clustering

Cluster analysis encompasses a number of different classification algorithms that can be used to organize observed data into meaningful structures. For example, k-means is an algorithm that assigns a specific number of centers, k, to represent the clustering of N points (k < N). The centers are iteratively adjusted so that each point is assigned to exactly one cluster, and the centroid of each cluster is the mean of its assigned points. In general, the k-means technique will produce exactly k different clusters of the greatest possible distinction.
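The assign-then-recenter loop just described can be sketched in a few lines; the two well-separated 2-D blobs below are synthetic, purely for illustration:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # (N, k) matrix of point-to-center distances
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):        # guard against empty clusters
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

# Two well-separated synthetic blobs.
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=[0, 0], scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=[5, 5], scale=0.3, size=(50, 2))
centers, labels = kmeans(np.vstack([blob_a, blob_b]), k=2)
```

After a few iterations the two centers settle near the two blob centroids and each point ends up in exactly one cluster.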

Alternatively, fuzzy techniques can be used as a method for clustering. Fuzzy clustering partitions a data set into fuzzy clusters such that each data point can belong to multiple clusters. Fuzzy c-means (FCM) is a well-known fuzzy clustering technique that generalizes the classical (hard) c-means algorithm and can be used where it is unclear how many clusters there should be for a given set of data.
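A sketch of the standard FCM updates (Bezdek's alternating optimization with fuzzifier m = 2; the data and parameter choices are illustrative only):

```python
import numpy as np

def fuzzy_cmeans(points, c, m=2.0, iters=30, seed=0):
    """Fuzzy c-means: every point gets a membership grade in every
    cluster; centers are membership-weighted means of the points."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(points), c))
    u /= u.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))          # standard membership update
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u

rng = np.random.default_rng(2)
pts = np.vstack([rng.normal([0, 0], 0.2, (40, 2)),
                 rng.normal([4, 4], 0.2, (40, 2))])
centers, u = fuzzy_cmeans(pts, c=2)
```

Each row of `u` is a fuzzy partition of one point over the clusters; for well-separated data the maximum membership approaches 1, recovering the hard clustering as a special case.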


Subtractive clustering is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers in a set of data. The cluster estimates obtained from subtractive clustering can be used to initialize iterative optimization-based clustering methods and model identification methods.
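A sketch of the potential-based procedure (following Chiu's formulation, with a simplified stopping rule; the radius `ra` and ratio `eps` are illustrative choices, not values from the text):

```python
import numpy as np

def subtractive_clustering(points, ra=1.0, eps=0.15):
    """One-pass subtractive clustering: each point's 'potential' counts
    how many neighbours it has; the highest-potential point becomes a
    center, its influence is subtracted, and the process repeats until
    the remaining potential is a small fraction of the first peak."""
    alpha = 4.0 / ra ** 2
    beta = 4.0 / (1.5 * ra) ** 2               # wider subtraction radius
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * d2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while potential.max() > eps * first_peak and len(centers) < len(points):
        i = potential.argmax()
        centers.append(points[i])
        potential -= potential[i] * np.exp(-beta * d2[i])
    return np.array(centers)

rng = np.random.default_rng(3)
pts = np.vstack([rng.normal([0, 0], 0.2, (30, 2)),
                 rng.normal([3, 3], 0.2, (30, 2))])
centers = subtractive_clustering(pts, ra=1.0)
```

Note that the number of clusters is an output, not an input, which is exactly why the estimates make good initial guesses for iterative methods such as FCM.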

In addition, the self-organizing map technique known as Kohonen's self-organizing feature map (Kohonen, 1997) can be used as an alternative for clustering purposes. This technique converts patterns of arbitrary dimensionality (the pattern space) into the responses of one- or two-dimensional arrays of neurons (the feature space). This unsupervised learning model can discover any relationship of interest, such as patterns, features, correlations, or regularities in the input data, and translate the discovered relationship into outputs. The first application of clustering techniques to combine different seismic attributes was introduced in the mid-eighties (Aminzadeh and Chatterjee, 1984).
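A minimal 1-D Kohonen map, sketched with the usual decaying learning-rate and neighbourhood schedule (the uniform 2-D samples and all schedule constants are illustrative assumptions):

```python
import numpy as np

def train_som(data, n_nodes=10, iters=500, seed=0):
    """1-D Kohonen map: find the best-matching unit (BMU) for a random
    sample, then pull the BMU and its grid neighbours toward the sample,
    shrinking the learning rate and neighbourhood radius over time."""
    rng = np.random.default_rng(seed)
    weights = rng.random((n_nodes, data.shape[1]))
    grid = np.arange(n_nodes)
    for t in range(iters):
        x = data[rng.integers(len(data))]
        bmu = np.linalg.norm(weights - x, axis=1).argmin()
        lr = 0.5 * (1.0 - t / iters)                       # decaying rate
        radius = max(1.0, (n_nodes / 2.0) * (1.0 - t / iters))
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * radius ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

# Synthetic 2-D samples; the 1-D chain of nodes spreads to cover them.
rng = np.random.default_rng(1)
data = rng.random((200, 2))
weights = train_som(data)
q_err = np.mean([np.linalg.norm(weights - x, axis=1).min() for x in data])
```

The neighbourhood function `h` is what makes the map topology-preserving: nodes that are adjacent on the 1-D grid are dragged toward similar regions of the pattern space.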

2.5 Data Integration and Reservoir Property Estimation

Historically, the link between reservoir properties and seismic and log data has been established through either “statistics-based” or “physics-based” approaches. The latter, also known as model-based approaches, attempt to exploit the changes in seismic character, or the sensitivity of a seismic attribute to a given reservoir property, based on physical phenomena. Here, the key issues are sensitivity and uniqueness. Statistics-based methods attempt to establish a heuristic relationship between seismic measurements and prediction values from examination of the data only. It can be argued that a hybrid method, combining the strengths of statistics- and physics-based methods, would be most effective. Figure 3, taken from Aminzadeh (1999), shows the concepts schematically.

Fig. 3 A schematic description of physics-based (blue), statistics-based (red), and hybrid (green) methods linking data (seismic, log, uncertainty) to properties; statistical methods include regression, clustering, cross plots, kriging, co-kriging, and ANN, while physical methods include rock measurements, synthetic modeling, bright spot, Q, and Biot-Gassmann


Many geophysical analysis methods, and consequently seismic attributes, are based on physical phenomena. That is, based on certain theoretical physics (wave propagation, the Biot-Gassmann equation, the Zoeppritz equations, tuning thickness, shear wave splitting, etc.), certain attributes may be more sensitive to changes in certain reservoir properties. In the absence of a theory, using experimental physics (for example, rock property measurements in a laboratory environment such as the one described in the last section of this paper) and/or numerical modeling, one can identify or validate suspected relationships. Although physics-based methods and direct measurements (the ground truth) are the ideal and reliable way to establish such correlations, for various reasons this is not always practical. Those reasons range from the lack of known theories to differences between the laboratory and field environments (noise, scale, etc.) and the cost of conducting elaborate physical experiments.

Statistics-based methods aim at deriving an explicit or implicit heuristic relationship between measured values and the properties to be predicted. Neural-network and fuzzy-neural-network-based methods are ideally suited to establishing such implicit relationships through proper training. In all cases, we attempt to establish a relationship between different seismic attributes, petrophysical measurements, laboratory measurements, and different reservoir properties. In such statistics-based methods, one has to keep in mind the impact of noise on the data, the data population used for statistical analysis, scale, the geologic environment, and the correlation between different attributes when performing clustering or regressions. The statistics-based conclusions have to be reexamined and their physical significance explored.

2.6 Quantification of Data Uncertainty, Prediction Error, and Confidence Interval

One of the main problems we face is to handle the non-uniqueness issue and to quantify uncertainty and confidence intervals in our analysis. We also need to understand the incremental improvements in prediction error and confidence range gained from the introduction of new data or a new analysis scheme. Methods such as evidential reasoning and fuzzy logic are most suited for this purpose. Figure 4 shows the distinction between conventional probability and these techniques. “Point probability” describes the probability of an event, for example, having a commercial reservoir; the implication is that we know exactly what this probability is. Evidential reasoning provides an upper bound (plausibility) and a lower bound (credibility) for the event; the difference between the two bounds is considered the ignorance range. Our objective is to reduce this range through the use of all the new information. Given that in real life we may have non-rigid boundaries for the upper and lower bounds, and that we ramp our confidence in an event up or down at some point, we introduce fuzzy logic to handle this and refer to it as a “membership grade”. Next-generation earth modeling will incorporate quantitative representations of geological processes and stratigraphic/structural variability. Uncertainty will be quantified and built into the models.
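The credibility/plausibility bounds can be made concrete with a tiny Dempster-Shafer-style mass assignment; the frame of discernment and the mass values below are invented purely for illustration:

```python
# Masses assigned to subsets of the frame {reservoir (R), dry (D)};
# mass on the full set {R, D} is uncommitted evidence (ignorance).
masses = {
    frozenset({"R"}): 0.2,       # evidence directly for a reservoir
    frozenset({"D"}): 0.3,       # evidence directly against
    frozenset({"R", "D"}): 0.5,  # uncommitted evidence
}

def belief(event):
    """Credibility: total mass of subsets wholly inside the event."""
    return sum(m for s, m in masses.items() if s <= event)

def plausibility(event):
    """Plausibility: total mass of subsets overlapping the event."""
    return sum(m for s, m in masses.items() if s & event)

reservoir = frozenset({"R"})
bel, pl = belief(reservoir), plausibility(reservoir)
ignorance = pl - bel    # the interval the text calls the ignorance range
```

New evidence reallocates mass away from the full set {R, D}, which is precisely the mechanism that shrinks the ignorance range as the text describes.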


Fig. 4 Point probability, evidential reasoning, and fuzzy logic: credibility and plausibility bound the probability of an event, the gap between them being the ignorance range (confidence interval), while membership grades soften the bounds

On the issue of non-uniqueness, the more sensitive a particular seismic character is to a given change in a reservoir property, the easier that property is to predict. The more uniquely a change in seismic character can be attributed to a change in a specific reservoir property, the higher the confidence level in such predictions. Fuzzy logic can handle subtle changes in the impact of different reservoir properties on the wavelet response. Moreover, comparison of a multitude of wavelet responses (for example, near-, mid-, and far-offset wavelets) is easier through the use of neural networks. As discussed in Aminzadeh and de Groot (2001), let us assume seismic patterns for three different lithologies (sand, shaly sand, and shale) are compared from different well information and seismic responses (both model and field data), and the respective seismic character within the time window of the reservoir interval is compared with four “classes” of wavelets (w1, w2, w3, and w4). These four wavelets (basis wavelets) serve as a segmentation vehicle. The histograms in Fig. 5a show which classes of wavelets are likely to be present for given lithologies. In the extreme positive (EP) case, we would have one wavelet uniquely representing one lithology. In the extreme negative (EN) case, we would have a uniform distribution of all wavelets for all lithologies. In most cases, unfortunately, we are closer to EN than to EP. The question is how best we can move these distributions from the EN side toward the EP side, thus improving our prediction capability and increasing the confidence level. The common-sense answer is to enhance the information content of the input data.

What if we use wavelet vectors comprised of pre-stack data (in the simplest case, near-, mid-, and far-offset data) as the input to a neural network to perform the classification? Intuitively, this should lead to a better separation of different lithologies (or other reservoir properties). Likewise, including three-component data as the input to the classification process would further improve the confidence level. Naturally, this requires the introduction of a new “metric” measuring “the similarity” of these “wavelet vectors”. This can be done using the new basis wavelet vectors as input to a neural network, applying different weights to near-, mid-, and far-offset traces. This is demonstrated conceptually in Fig. 5 for predicting lithology. Compare the sharper histograms of the vector wavelet classification (in this case, near-, mid-, and far-offset gathers) in Fig. 5b against those of Fig. 5a, which are based on scalar wavelet classification.


Fig. 5 Statistical distribution of different wavelet types versus lithologies: (a) stacked data; (b) pre-stack data

3 Artificial Neural Networks and Geoscience Applications of ANN for Exploration

Although artificial neural networks (ANN) were introduced in the late fifties (Rosenblatt, 1962), interest in them has been growing rapidly in recent years. This has been in part due to new application fields in academia and industry. Also, advances in computer technology (both hardware and software) have made it possible to develop ANN capable of tackling practically meaningful problems with a reasonable response time. Simply put, neural networks are computer models that attempt to simulate specific functions of the human nervous system. This is accomplished through parallel structures comprised of non-linear processing nodes that are connected by fixed (Lippmann, 1987), variable (Barhen et al., 1989), or fuzzy (Gupta and Ding, 1994) weights. These weights establish a relationship between the inputs and output of each “neuron” in the ANN. Usually ANN have several “hidden” layers, each comprised of several neurons. Feed-forward (FF), or concurrent, networks are those with unidirectional data flow; full technical details can be found in Bishop (1995). If an FF network is trained by back-propagation (BP) algorithms, it is called a BP network. Other types of ANN are unsupervised (self-organizing) and auto- (hetero-) associative networks.
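A minimal feed-forward network trained by back-propagation, sketched in NumPy on the classic XOR mapping; the layer size, learning rate, and iteration count are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a mapping that needs at least one hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 neurons; weights start small and random.
W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
W2, b2 = rng.normal(0, 1, (8, 1)), np.zeros(1)

lr, losses = 0.5, []
for _ in range(10000):
    # forward pass through the unidirectional (feed-forward) structure
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(((out - y) ** 2).mean())
    # backward pass: error gradients propagated layer by layer
    # (constant factors of the MSE are folded into the learning rate)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)
```

The weight matrices `W1` and `W2` are exactly the fixed, trainable connections the paragraph describes; back-propagation is nothing more than the chain rule applied through them in reverse.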

Neurocomputing represents general computation with the use of artificial neural networks. An artificial neural network is a computer model that attempts to mimic simple biological learning processes and simulate specific functions of the human nervous system. It is an adaptive, parallel information processing system which is able to develop associations, transformations, or mappings between objects or data. It is also the most popular intelligent technique for pattern recognition to date.

The major applications of neurocomputing are seismic data processing and interpretation, well logging, and reservoir mapping and engineering.

Good-quality seismic data are essential for realistic delineation of reservoir structures. Seismic data quality depends largely on the efficiency of data processing. The processing step is time consuming and complex. The major applications include first-arrival picking, noise elimination, structural mapping, horizon picking, and event tracking. A detailed review can be found in Nikravesh and Aminzadeh (1999).


For interwell characterization, neural networks have been used to derive reservoir properties from crosswell seismic data. In Chawathé et al. (1997), the authors used a neural network to relate five seismic attributes (amplitude, reflection strength, phase, frequency, and quadrature) to gamma ray (GR) logs obtained at two wells in the Sulimar Queen field (Chaves County). The GR response was then predicted between the wells and subsequently converted to porosity based on a field-specific porosity-GR transform. The results provided good delineation of various lithofacies.

Feature extraction from 3D seismic attributes is an extremely important area. Most statistical methods fail here due to the inherent complexity and nonlinear information content. Figure 3 shows an example use of neural networks for segmenting seismic characters, thus deducing information on the seismic facies and reservoir properties (lithology, porosity, fluid saturation, and sand thickness). A display of the level of confidence (degree of match) between the seismic character at a given point and the representative wavelets (centers of clusters) is also shown. Combining this information with the seismic model derived from well logs, while perturbing different properties, gives physical meaning to the different clusters.

Monson and Pita (1997) applied neural networks to find relationships between 3D seismic attributes and well logs. The study provided realistic prediction of log responses far away from the wellbore. Boadu (1997) also used similar technology to relate seismic attributes to rock properties for sandstones. In Nikravesh et al., the authors applied a combination of k-means clustering, neural networks and fuzzy c-means (a clustering algorithm in which each data vector belongs to each of the clusters to a degree specified by a membership grade) to characterize a field that produces from the Ellenburger Dolomite. The techniques were used to perform clustering of 3D seismic attributes and to establish relationships between the clusters and the production log. The production log was estimated away from the wellbore, and the production log and the clusters were then superimposed at each point of a 3D seismic cube. They also identified optimum locations for new wells based on the connectivity, size and shape of the clusters related to the pay zones.
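The fuzzy c-means membership grades described above can be sketched with the standard FCM membership formula: each sample belongs to every cluster to a degree in [0, 1], and the degrees sum to one. This is a minimal NumPy illustration (the data and cluster centers are invented), not the clustering workflow of the cited study.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership grades.

    X: (n, d) data vectors; centers: (c, d) cluster centers;
    m: fuzzifier (m > 1). Returns an (n, c) matrix whose rows sum to 1.
    """
    # distance from every sample to every center, shape (n, c)
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)            # avoid division by zero at a center
    inv = d ** (-2.0 / (m - 1.0))    # standard FCM membership exponent
    return inv / inv.sum(axis=1, keepdims=True)
```

A sample sitting exactly on a cluster center receives a membership of essentially one in that cluster and near zero in the others.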

The use of neural networks in well logging has been popular for nearly one decade. Many successful applications have been documented (Wong et al., 1998; Brus et al., 1999; Wong et al., 2000). The most recent work by Bruce et al. (2000) presented a state-of-the-art review of the use of neural networks for predicting permeability from well logs. In this application, the network is used as a nonlinear regression tool to develop a transformation between well logs and core permeability. Such a transformation can be used for estimating permeability in uncored intervals and wells. In this work, the permeability profile was predicted by a Bayesian neural network. The network was trained on a training set with four well logs (GR, NPHI, RHOB and RT) and core permeability. The network also provided a measure of confidence (the standard deviation of a Gaussian function): the higher the standard deviation (“sigma”), the lower the prediction reliability. This is very useful for understanding the risk of data extrapolation. The same tool can be applied to estimate porosity and fluid saturations. Another important application is the clustering of well logs for the recognition of lithofacies (Rogers et al., 1992). This provides useful information for improved petrophysical estimates and well correlation.

Neurocomputing has also been applied to reservoir mapping. In Wong et al. (1997) and Wang et al. (1998, 1999), the authors applied a radial basis function neural network to relate the conceptual distribution of geological facies (in the form of hand drawings) to reservoir porosity. The approach is able to incorporate the general property trend provided by local geological knowledge and to simulate fine-scaled details when used in conjunction with geostatistical simulation techniques. In Caers (1999) and Caers and Journel (1998), the authors trained a neural network to recognize the local conditional probability based on multiple-point information retrieved from a “training image,” which can be any densely populated image (e.g. outcrop data, photographs, hand drawings, seismic data, etc.). The conditional probability was used in stochastic simulation with a Markov chain sampler (e.g. Markov chain Monte Carlo). These methodologies can be applied to produce 3D models of petrophysical properties using multiple seismic attributes and conceptual geological maps. This is a significant advantage compared to the conventional geostatistical methods, which are limited to two-point statistics (e.g. variograms) and simple objects (e.g. channels).

In Nikravesh et al. (1996), the authors used neural networks to predict field performance and optimize oil recovery by water injection in the Lost Hills Diatomite (Kern County). They constructed several neural networks to model individual well behavior (wellhead pressure and injection-production history) based on data obtained from the surrounding wells. The trained networks were used to predict future fluid production. The results matched well with the actual data. The study also led to the best oil recovery with the minimum water injected.

In what follows we will review the geoscience applications in these broad areas: data processing and prediction. We will not address other geoscience applications such as classification of multi-source remote sensing data, Benediktsson et al. (1990), earthquake prediction, Aminzadeh et al. (1994), and ground water remediation, Johnson and Rogers (1975).

3.1 Data Processing

Various types of geoscience data are used in the oil industry to ultimately locate the most prospective locations for oil and gas reservoirs. These data sets go through an extensive amount of processing and manipulation before they are analyzed and interpreted. The processing step is very time consuming yet very important. ANNs have been utilized to help improve the efficiency of operation in this step. Under this application area we will examine first seismic arrival picking and noise elimination problems. Also, see Aminzadeh (1991), McCormack (1991), Zadeh and Aminzadeh (1995) and Aminzadeh et al. (1999) for other related applications.

3.1.1 First Arrival Picking

Seismic data are the response of the earth to any disturbance (compressional waves or shear waves). The seismic source can be generated either artificially (petroleum seismology, PS) or naturally (earthquake seismology, ES). The recorded seismic data are then processed and analyzed to make an assessment of the subsurface (both the geological structures and rock properties) in PS, and of the nature of the source (location or epicenter, and magnitude, for example on the Richter scale) in ES. Conventional PS relies heavily on compressional (P-wave) data, while ES is essentially based on shear (S-wave) data.

The first arrivals of P and S waves on a seismic record contain useful information both in PS and ES. However, one should make sure that the arrival is truly associated with a seismically generated event and not noise generated by various factors. Since we usually deal with thousands of seismic records, their visual inspection for distinguishing first seismic arrivals (FSA) from noise, even if reliable, could be quite time consuming.

One of the first geoscience applications of ANNs has been to streamline the operation of identifying the FSA in an efficient and reliable manner. Among the recent publications in this area are McCormack (1990) and Veezhinathan et al. (1991). Key elements of the latter (V91) are outlined below.

Here, FSA picking is treated as a pattern recognition problem. Each event is classified either as an FSA or a non-FSA. A segment of the data within a window is used to obtain four “Hilbert” attributes of the seismic signal. The Hilbert attributes of seismic data were introduced by Taner et al. (1979). In V91, these attributes are derived from the seismic signal using a sliding time window. The attributes are: 1) maximum amplitude, 2) mean power level (MPL), 3) power ratios, and 4) envelope slope peak. These types of attributes had been used by Aminzadeh and Chatterjee (1984) for predicting gas sands using clustering and discriminant analysis techniques.
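The Hilbert attributes above are all derived from the analytic signal of the trace. A minimal sketch of the windowed computation is given below; it constructs the analytic signal in the frequency domain and reports three of the four attributes (the power ratio between adjacent windows is omitted). The window placement follows the description above, but this is an illustration, not the V91 code.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the frequency-domain Hilbert construction."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0        # double the positive frequencies
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(X * h)

def window_attributes(trace, start, width):
    """Sliding-window attributes of the kind used for FSA picking."""
    w = trace[start:start + width]
    env = np.abs(analytic_signal(w))   # envelope = reflection strength
    return {
        "max_amplitude": np.max(np.abs(w)),
        "mean_power": np.mean(w ** 2),             # MPL
        "envelope_slope_peak": np.max(np.abs(np.diff(env))),
    }
```

For a pure sinusoid the envelope is flat at the signal amplitude, which is a convenient sanity check on the analytic-signal construction.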

In V91, the network processes three adjacent peaks at a time to decide whether the center peak is an FSA or a non-FSA. A BPN with five hidden layers, combined with a post-processing scheme, achieved 97% correct picks. Adding a fifth attribute, distance from the travel time curve, generated satisfactory results without the need for the post-processing step.

McCormack (1990) created a binary image from the data and used it to train the network to move up and down across the seismic record to identify the FSA. This image-based approach captures space-time information in the data but requires a large number of input units, thus necessitating a large network. Some empirical schemes are used to ensure its stability.

3.1.2 Noise Elimination

A problem related to FSA picking is editing noise from the seismic record. The objective here is to identify events of non-seismic origin (the reverse of FSA picking) and then remove them from the original data in order to increase the signal-to-noise ratio. Liu et al. (1989), McCormack (1990) and Zhang and Li (1994) are some of the publications in this area.

Zhang and Li (1994) handled the simpler problem of editing out whole noisy traces from the record. They initiate the network in the “learning” phase by “scanning” over the whole data set. The weights are adapted in the learning phase either with some human input as the distinguishing factor between “good” and “bad” traces, or during an unsupervised learning phase. Then, in the “recognizing” phase, the data are scanned again and, depending on whether the output of the network is less than or greater than a threshold level, the trace is either left alone or edited out as a bad trace.
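The threshold test in the recognizing phase can be sketched as follows, with a simple RMS-based score standing in for the trained network output (the score and the threshold value are illustrative assumptions, not part of Zhang and Li's scheme).

```python
import numpy as np

def edit_noisy_traces(record, threshold=3.0):
    """Flag and zero out whole traces whose score exceeds a threshold.

    record: (n_traces, n_samples) array. The score here is the trace RMS
    relative to the median trace RMS; in the published scheme the score
    would come from the trained network instead.
    """
    rms = np.sqrt(np.mean(record ** 2, axis=1))   # one score per trace
    score = rms / np.median(rms)
    bad = score > threshold
    edited = record.copy()
    edited[bad] = 0.0                             # edit out bad traces
    return edited, bad
```

A trace whose amplitude is wildly out of line with its neighbors is zeroed; all others pass through untouched.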

3.2 Identification and Prediction

Another major application area for ANNs in the oil industry is the prediction of various reservoir properties. The predictions are ultimately used as a decision tool for exploration and development drilling and for redevelopment or extension of existing fields. The input data for this prediction problem are usually processed and interpreted seismic and log data and/or a set of attributes derived from the original data set. Historically, many “hydrocarbon indicators” have been proposed to make such predictions. Among them are bright spot analysis, Sheriff and Geldart (1982), amplitude versus offset analysis, Ostrander (1982), seismic clustering analysis, Aminzadeh and Chatterjee (1984), fuzzy pattern recognition, Griffiths (1987), and other analytical methods, Agterberg and Griffiths (1991). Many of the ANNs developed for this purpose are built around the earlier techniques, either to establish a relationship between the raw data and physical properties of the reservoirs and/or to train the network using the previously established relationships.

Huang and Williamson (1994) developed a general regression neural network (GRNN) to predict rock total organic carbon (TOC) using well log data. First, they modeled the relationship between the resistivity log and TOC with a GRNN, using published data. After training the ANN in two different modes, the GRNN found optimum values of sigma, an important smoothing parameter used in GRNN. They established the superiority of the GRNN over a BP-ANN in determining the architecture of the network. After completing the training phase, a predictive equation for determining TOC was derived. Gogan et al. (1995) used an ANN for determining lithology and fluid saturation from well log and pre-stack seismic data. Various seismic attributes from partial stacks (near, mid and far offsets) were used as input to the ANN. The network was calibrated using synthetic (theoretical) data with the pre-stack seismic response of known lithologies and saturations from the well log data. The output of the network was a set of classes of lithologies and saturations.

3.3 Neural Network and Nonlinear Mapping

In this section, a series of neural network models will be developed for nonlinear mapping between wireline logs. A series of neural network models will also be developed to analyze actual well log data and seismic information, and the nonlinear mapping between wireline logs and seismic attributes will be recognized.

In this study, wireline logs such as travel time (DT), gamma ray (GR), and density (RHOB) will be predicted based on SP and resistivity (RILD) logs. In addition, we will predict travel time (DT) based on induction resistivity and vice versa. In this study, all logs are scaled uniformly between -1 and 1, and results are given in the scaled domain.
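The uniform scaling to [-1, 1] can be written as a standard min-max transform. This is a sketch of the usual formula; the chapter does not state the exact normalization used.

```python
import numpy as np

def scale_log(values, lo=-1.0, hi=1.0):
    """Scale a wireline log uniformly into [lo, hi] (here [-1, 1])."""
    vmin, vmax = np.min(values), np.max(values)
    return lo + (hi - lo) * (values - vmin) / (vmax - vmin)
```

The minimum of the log maps to -1, the maximum to 1, and intermediate values scale linearly in between.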

Figures 6A through 6E show the typical behavior of SP, RILD, DT, GR, and RHOB logs in the scaled domain. The design of a neural network to predict DT, GR, and RHOB based on RILD and SP logs starts with filtering, smoothing, and interpolating values (over a small horizon) for missing information in the data set. A first-order filter and a simple linear recursive parameter estimator were used to filter and reconstruct the noisy data. The available data were divided into three data sets: training, testing, and validation. The network was trained on the training data set and continuously tested on the test data set during the training phase. The network was trained using a backpropagation algorithm and a modified Levenberg-Marquardt optimization technique. Training was stopped when the prediction error on the test data set began to deteriorate with further training steps.
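The stopping criterion above is classic early stopping: monitor the test-set error while training and stop once it deteriorates. A generic sketch follows; the `patience` parameter is an assumption, since the study only states that training stopped when test predictions deteriorated.

```python
def train_with_early_stopping(train_step, test_error, max_epochs=1000, patience=3):
    """Train while continuously monitoring a held-out test error, and
    stop once that error has worsened for `patience` consecutive epochs.

    train_step(epoch): performs one training pass (side effects only).
    test_error(epoch): returns the current test-set error.
    Returns (best_epoch, best_error).
    """
    best_err, best_epoch, bad_streak = float("inf"), 0, 0
    for epoch in range(1, max_epochs + 1):
        train_step(epoch)
        err = test_error(epoch)
        if err < best_err:
            best_err, best_epoch, bad_streak = err, epoch, 0
        else:
            bad_streak += 1
            if bad_streak >= patience:
                break                 # test error is deteriorating: stop
    return best_epoch, best_err
```

The model weights saved at `best_epoch` would then be the ones carried forward to the validation data set.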

Fig. 6 Typical behavior of SP, RILD, DT, GR, and RHOB logs (panels a–e: scaled SP, RILD, DT, GR, and RHOB versus sampling space)

3.3.1 Travel Time (DT) Prediction Based on SP and Resistivity (RILD) Logs

The neural network model to predict DT has 14 input nodes (two windows of data, each with 7 data points) representing the SP (7 points or input nodes) and RILD (7 points or input nodes) logs. The hidden layer has 5 nodes. The output layer has 3 nodes representing the prediction of DT (a window of 3 data points). Typical performance of the neural network for the training, testing, and validation data sets is shown in Figures 7A, 7B, 7C, and 7D. The network shows good performance for prediction of DT for the training, testing, and validation data sets. However, there is not a perfect match between actual and predicted values of DT in the testing and validation data sets. This is due to changes of the lithology from point to point. In other words, some of the data points in the testing and validation data sets lie in a lithological layer which was not presented in the training phase. Therefore, to have a perfect mapping, it would be necessary either to use the layering information (from other types of logs or linguistic information) as input into the network, or to use a larger training data set that represents all the possible behaviors in the data.
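The input-output layout described above (two 7-sample windows in, one 3-sample window out) can be sketched as a windowing routine. This is a minimal illustration of how such training pairs are assembled; the exact window alignment used in the study is not specified, so a centered alignment is assumed.

```python
import numpy as np

def make_windowed_pairs(sp, rild, dt, in_width=7, out_width=3):
    """Build training pairs for the window-based network.

    Each input row concatenates aligned windows of `in_width` samples
    from the SP and RILD logs (14 nodes total); each target is the
    centered window of `out_width` DT samples.
    """
    X, y = [], []
    half = in_width // 2
    for i in range(half, len(dt) - half):
        X.append(np.concatenate([sp[i - half:i + half + 1],
                                 rild[i - half:i + half + 1]]))
        y.append(dt[i - out_width // 2:i + out_width // 2 + 1])
    return np.array(X), np.array(y)
```

The same routine, with different log arrays, serves the GR and RHOB models described in the following subsections.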

3.3.2 Gamma Ray (GR) Prediction Based on SP and Resistivity (RILD) Logs

In this study, a neural network model is developed to predict GR based on SP and RILD logs. The network has 14 input nodes (two windows of data, each with 7 data points) representing the SP (7 points or input nodes) and RILD (7 points or input nodes) logs. The hidden layer has 5 nodes. The output layer has 3 nodes representing the prediction of GR. Figures 8A through 8D show the performance of the neural network for the training, testing, and validation data. The neural network model shows good performance for prediction of GR for the training, testing, and validation data sets. In comparison with the previous study (DT prediction), this study shows that GR is not as sensitive as DT to noise in the data. In addition, a better global relationship exists between SP-resistivity-GR than between SP-resistivity-DT; however, the local relationships are of the same order of complexity. Therefore, the two models have the same (excellent) performance for training, but the model for prediction of GR has a better generalization property. Since the two models were trained based on the same criteria, it is unlikely that this lack of mapping for generalization is due to overfitting during the training phase.

3.3.3 Density (RHOB) Prediction Based on SP and Resistivity (RILD) Logs

To predict density based on SP and RILD logs, a neural network model is developed with 14 input nodes representing the SP (7 points or input nodes) and RILD (7 points or input nodes) logs, 5 nodes in the hidden layer, and 3 nodes in the output layer representing the prediction of RHOB. Figures 9A through 9D show a typical performance of the neural network for the training, testing, and validation data sets. The network shows excellent performance for the training data set, as shown in Figs. 9A and 9B. The model shows good performance for the testing data set, as shown in

Fig. 7 Typical neural network performance for prediction of DT based on SP and RILD (a: DT versus sampling space for the training data set; b: prediction versus actual DT for the training data set; solid color: actual, light color: prediction)

Fig. 9C. Figure 9D shows the performance of the neural network for the validation data set. The model has relatively good performance for the validation data set; there is not a perfect match between the actual and predicted values of RHOB for the testing and validation data sets. Since RHOB is directly related to lithology and layering, to have a perfect mapping it would be necessary to use the layering information (from other types of logs or linguistic information) as an

Fig. 7 (continued) (c: test data set; d: validation data set; DT versus sampling space; solid color: actual, light color: prediction)

input into the network, or to use a larger training data set that represents all the possible behaviors in the data. In these cases, one can use a knowledge-based approach, drawing on the knowledge of an expert, and select more diverse information representing all the different possible layerings as the training data set. Alternatively, one can use an automated clustering technique to recognize the important clusters existing in the data and use this information for selecting the training data set.

Fig. 8 Typical neural network performance for prediction of GR based on SP and RILD (a: GR versus sampling space for the training data set; b: prediction versus actual GR for the training data set; solid color: actual, light color: prediction)

3.3.4 Travel Time (DT) Prediction Based on Resistivity (RILD)

The neural network model to predict DT has 11 input nodes representing a RILD log. The hidden layer has 7 nodes. The output layer has 3 nodes representing the prediction of DT. Using engineering knowledge, a training data set is carefully

Fig. 8 (continued) (c: test data set; d: validation data set; GR versus sampling space; solid color: actual, light color: prediction)

selected so as to represent all the possible layering existing in the data. The typical performance of the neural network for the training, testing, and validation data sets is shown in Figs. 10A through 10D. As expected, the network has excellent performance for prediction of DT. Even though only RILD logs are used for prediction of DT, the network model has better performance than when SP and RILD logs

Fig. 9 Typical neural network performance for prediction of RHOB based on SP and RILD (a: RHOB versus sampling space for the training data set; b: prediction versus actual RHOB for the training data set; solid color: actual, light color: prediction)

were used for prediction of DT (compare Figs. 10A through 10D with Figs. 7A through 7D). However, in this study, the knowledge of an expert was used as extra information. This knowledge not only reduced the complexity of the model but also led to better prediction.

Fig. 9 (continued) (c: test data set; d: validation data set; RHOB versus sampling space; solid color: actual, light color: prediction)

3.3.5 Resistivity (RILD) Prediction Based on Travel Time (DT)

In this section, to show that the technique presented in the previous section is effective, the performance of the inverse model is tested. The network model has 11 input nodes representing DT, 7 nodes in the hidden layer, and 3 nodes in the output

Fig. 10 Typical neural network performance for prediction of DT based on RILD (a: DT versus sampling space for the training data set; b: prediction versus actual DT for the training data set; solid color: actual, light color: prediction)

layer representing the prediction of RILD. Figures 11A through 11D show the performance of the neural network model for the training, testing, and validation data sets. Figures 11A and 11B show that the neural network has excellent performance for the training data set. Figures 11C and 11D show the performance of

Fig. 10 (continued) (c: test data set; d: validation data set; DT versus sampling space; solid color: actual, light color: prediction)

the network for the testing and validation data sets. The network shows very good performance for testing and validation purposes. As mentioned in the previous section, by using engineering knowledge the complexity of the model was reduced and better performance was achieved.

Fig. 11 Typical neural network performance for prediction of RILD based on DT (a: resistivity versus sampling space for the training data set; b: prediction versus actual resistivity for the training data set; solid color: actual, light color: prediction)

In addition, since the network model (prediction of DT from resistivity) and its inverse (prediction of resistivity based on DT) both have very good performance and generalization properties, a one-to-one mapping was achieved. This implies that a good representation of the layering was selected based on the knowledge of an expert.

Fig. 11 (continued) (c: test data set; d: validation data set; resistivity versus sampling space; solid color: actual, light color: prediction)

4 Fuzzy Logic

In recent years, it has been shown that uncertainty may be due to fuzziness rather than chance. Fuzzy logic is considered appropriate for dealing with the nature of uncertainty in systems and human error, which are not included in current reliability theories. The basic theory of fuzzy sets was first introduced by Zadeh (1965). Unlike classical logic, which is based on crisp sets of “true and false,” fuzzy logic views problems as a degree of “truth,” or “fuzzy sets of true and false” (Zadeh 1965). Despite the meaning of the word “fuzzy,” fuzzy set theory does not permit vagueness. It is a methodology that was developed to obtain an approximate solution where the problems are subject to vague description. In addition, it can help engineers and researchers to tackle uncertainty and to handle imprecise information in complex situations. During the past several years, the successful application of fuzzy logic for solving complex problems subject to uncertainty has greatly increased, and today fuzzy logic plays an important role in various engineering disciplines. In recent years, considerable attention has also been devoted to the use of hybrid neural network-fuzzy logic approaches as an alternative for pattern recognition, clustering, and statistical and mathematical modeling. It has been shown that neural network models can be used to construct internal models that capture the presence of fuzzy rules. However, determination of the input structure and the number of membership functions for the inputs has been one of the most important issues in fuzzy modeling.

Fuzzy logic provides a completely new way of modeling complex and ill-defined systems. The major concept of fuzzy logic is the use of a linguistic variable, that is, a variable whose values are words or sentences in a natural or synthetic language. This also leads to the use of fuzzy if-then rules, in which the antecedents and consequents are propositions containing linguistic variables.
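A linguistic variable and a fuzzy if-then rule of the kind described above can be sketched as follows. The variable names, term ranges, and triangular membership shapes are illustrative assumptions, not taken from any of the cited studies.

```python
def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Linguistic variable "porosity" with two terms (hypothetical ranges):
porosity_is_low = lambda p: tri(p, -0.05, 0.05, 0.15)
porosity_is_high = lambda p: tri(p, 0.10, 0.25, 0.40)

def rule_strength(p, gr, gr_is_clean):
    """Firing strength of 'IF porosity is high AND GR is clean THEN
    good reservoir', using min as the AND operator (Mamdani style)."""
    return min(porosity_is_high(p), gr_is_clean(gr))
```

Each antecedent evaluates to a degree of truth in [0, 1], and the rule fires to the degree that all its antecedents hold simultaneously.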

In recent years, fuzzy logic, or more generally fuzzy set theory, has been applied extensively in many reservoir characterization studies. This is mainly due to the fact that reservoir geology is largely a descriptive science which uses mostly uncertain, imprecise, ambiguous and linguistic information (Bois 1984). Fuzzy set theory has the ability to deal with such information and to combine it with quantitative observations. The applications are many, including seismic and stratigraphic modeling and formation evaluation.

In Bois (1984), the author proposed to use fuzzy set theory as a pattern recognition tool to interpret a seismic section. The algorithm produced a synthetic seismic section by convolving a geological model with a representative impulse (obtained by deconvolution, or the signature of the source), both of which were of a subjective and fuzzy nature. The synthetic section was then compared with the original seismic section in a fuzzy context. In essence, the algorithm searches for the appropriate geological model from the observed seismic section by an iterative procedure, which is a popular way of solving inverse problems.

In Baygun et al. (1985), the authors used fuzzy logic as a classifier for the delineation of geological objects in a mature hydrocarbon reservoir with many wells. The authors showed that fuzzy logic can be used to extract the dimensions and orientation of geological bodies, and that geologists can use such a technique for reservoir characterization in a practical way that bypasses many tedious steps.

In Nordlund (1996), the author presented a study on dynamic stratigraphic modeling using fuzzy logic. In stratigraphic modeling, it is possible to model several processes simultaneously in space and time. Although many processes can be modeled using conventional mathematics, modeling the change of deposition and erosion on surfaces is often difficult. Formalizing geological knowledge is a difficult exercise, as it involves the handling of several independent and complex parameters. In addition, most information is qualitative and imprecise, which is unacceptable for direct numerical treatment. In the paper, the author showed a successful use of fuzzy rules to model erosion and deposition. The fuzzified variables included the depth of the reservoirs, the distances to source and shore, a sinusoidal sea-level curve, the tectonic subsidence rate, and simulation time with depositional surface. From the study, the author demonstrated that a few (10) fuzzy rules could produce stratigraphies with realistic geometry and facies.

In Cuddy (1997), the author applied fuzzy logic to solve a number of petrophysical problems in several North Sea fields. The work included the prediction of lithofacies and permeability from well logs. Lithofacies prediction was based on the use of a possibility value (a Gaussian function with a specific mean and variance) to represent a well log belonging to a certain lithofacies. The lithofacies associated with the highest combined fuzzy possibility (the multiplication of all values) was taken as the most likely lithofacies for that set of logs. A similar methodology was applied to predict permeability by dividing the core permeability values into ten equal bin sizes on a log scale, which converted the problem into a classification problem. All the results suggested that the fuzzy approach gave better petrophysical estimates compared to the conventional techniques.
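Cuddy's scheme, as described above, amounts to scoring each lithofacies by the product of Gaussian possibility values across the logs and picking the maximum. A minimal sketch follows; the facies names and log statistics below are invented for illustration and are not the North Sea calibration values.

```python
import math

def gaussian_possibility(x, mean, std):
    """Possibility that a log value belongs to a lithofacies whose log
    response is summarised by (mean, std); the peak value is 1 at the mean."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)

def classify_lithofacies(log_values, facies_stats):
    """Pick the lithofacies with the highest combined possibility
    (product of the per-log possibilities), as in Cuddy's approach.

    log_values: {log_name: value}; facies_stats: {facies: {log_name: (mean, std)}}.
    """
    best, best_score = None, -1.0
    for facies, stats in facies_stats.items():
        score = 1.0
        for log_name, x in log_values.items():
            mean, std = stats[log_name]
            score *= gaussian_possibility(x, mean, std)
        if score > best_score:
            best, best_score = facies, score
    return best, best_score
```

A set of log readings close to one facies's typical response receives a combined possibility near one for that facies and a vanishingly small one for the others.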

Fang and Chen (1997) also applied fuzzy rules to predict porosity and permeability from five compositional and textural characteristics of sandstone in the Yacheng Field (South China Sea). The five input attributes were the relative amounts of rigid grains, ductile grains and detrital matrix, the grain size, and the Trask sorting coefficient. All the porosity and permeability data were first divided into a certain number of clusters by fuzzy c-means. The corresponding sandstone characteristics for each cluster were used to generate the fuzzy linguistic rules. Each fuzzy cluster produced one fuzzy if-then rule with five input statements. The formulated rules were then used to make linguistic predictions by combining the individual conclusions from each rule. If a numerical output was desired, a defuzzification algorithm could be used to extract a crisp output from a fuzzy set. The results showed that the fuzzy modeling gave better results compared to those presented in Bloch (1991).
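A common defuzzification choice for extracting a crisp output from such a fuzzy conclusion is the centroid (centre-of-gravity) method. The paper does not specify which algorithm was used, so this is a generic sketch over a discretized output domain.

```python
def centroid_defuzzify(xs, memberships):
    """Crisp output from a discrete fuzzy set by the centroid method:
    the membership-weighted average of the domain points.

    xs: output-domain sample points; memberships: membership at each point.
    """
    total = sum(memberships)
    if total == 0:
        raise ValueError("cannot defuzzify an empty fuzzy set")
    return sum(x * m for x, m in zip(xs, memberships)) / total
```

For a fuzzy set that is symmetric about some value, the centroid returns exactly that value.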

In Huang et al. (1999), the authors presented a simple but practical fuzzy interpolator for predicting permeability from well logs in the North West Shelf (offshore Australia). The basic idea was to simulate local fuzzy reasoning. When a new input vector (of well logs) was given, the system would select the two training vectors nearest to the new input vector and build a set of piece-wise linear inference rules with the training values, in which the membership value of the training values was one. The study used well log and core data from two wells, and the performance was tested at a third well, where actual core data were available for comparison. The accuracy of the permeability predictions at the test well was similar to that

obtained from the authors’ previous neural-fuzzy technique; the fuzzy approach, however, was more than 7,000 times faster in terms of CPU time.
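The local fuzzy interpolation idea, selecting the two nearest training vectors and interpolating between them, can be sketched as follows. Inverse-distance weighting is an assumption standing in for the piece-wise linear inference rules of the original; the sketch does preserve the property that membership is one at a training vector itself.

```python
import numpy as np

def two_nearest_interpolate(X_train, y_train, x):
    """Local fuzzy interpolator sketch: find the two training vectors
    nearest to the query x and blend their target values with
    inverse-distance weights.

    X_train: (n, d) training inputs; y_train: (n,) targets; x: (d,) query.
    """
    d = np.linalg.norm(X_train - x, axis=1)
    i, j = np.argsort(d)[:2]          # indices of the two nearest vectors
    if d[i] == 0.0:
        return y_train[i]             # membership one at a training vector
    w_i, w_j = 1.0 / d[i], 1.0 / d[j]
    return (w_i * y_train[i] + w_j * y_train[j]) / (w_i + w_j)
```

Because only two stored vectors are consulted per query, the prediction cost is dominated by a single nearest-neighbor search, which is consistent with the large speed advantage reported over the neural-fuzzy alternative.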

In Nikravesh and Aminzadeh (2000), the authors applied a neural-fuzzy approach to develop an optimum set of rules for nonlinear mapping between porosity, grain size, clay content, P-wave velocity, P-wave attenuation and permeability. The rules developed from a training set were used to predict permeability in another data set. The prediction performance was very good. The study also showed that the integrated technique discovered clear relationships between P-wave velocity and porosity, and between P-wave attenuation and clay content, which were useful to geophysicists.

4.1 Geoscience Applications of Fuzzy Logic

The uncertain, fuzzy, and linguistic nature of geophysical and geological data makes it a good candidate for interpretation through fuzzy set theory. The main advantage of this technique is in combining quantitative data with qualitative information and subjective observation. The imprecise nature of the information available for interpretation (such as seismic data, wireline logs, and geological and lithological data) makes fuzzy set theory an appropriate tool. For example, Chappaz (1977) and Bois (1983, 1984) proposed using fuzzy set theory in the interpretation of seismic sections. Bois used fuzzy logic as a pattern recognition tool for seismic interpretation and reservoir analysis. He concluded that fuzzy set theory, in particular, can be used for interpretation of seismic data that are imprecise, uncertain, and include human error. He maintained that these types of error and fuzziness cannot be taken into consideration by conventional mathematics, whereas they are perfectly captured by fuzzy set theory. He also concluded that, using fuzzy set theory, one can determine geological information from seismic data, and therefore predict the boundary of the reservoir in which hydrocarbon exists. B. Baygun et al. (1985) used fuzzy logic as a classifier for delineation of geological objects in a mature hydrocarbon reservoir with many wells. They showed that fuzzy logic can be used to extract the dimensions and orientation of geological bodies, and that the geologist can use such a technique for reservoir characterization very quickly, bypassing several tedious steps. H. C. Chen et al., in their study, used fuzzy set theory for fuzzy regression analysis to extract the parameters of the Archie equation. Bezdek et al. (1981) also reported a series of applications of fuzzy set theory in geostatistical analysis. Tamhane et al. (2002) showed how to integrate linguistic descriptions in petroleum reservoirs using fuzzy logic.

Many of our geophysical analysis techniques such as migration, DMO, and wave equation modeling, as well as the potential methods (gravity, magnetic, and electrical methods), use conventional partial differential wave equations with deterministic coefficients. The same is true for the partial differential equations used in reservoir simulation. For many practical and physical reasons, deterministic parameters for the coefficients of these PDEs lead to unrealistic results (for example, medium velocities for seismic wave propagation, or fluid flow for the Darcy equation).

296 M. Nikravesh

Stochastic parameters in these cases can provide us with a more practical characterization. Fuzzy coefficients for PDEs can prove to be even more realistic and easy to parameterize. Today's deterministic processing and interpretation ideas will give way to stochastic methods, even if the industry has to rewrite the book on geophysics. That is, using wave equations with random and fuzzy coefficients to describe subsurface velocities and densities in statistical and membership-grade terms enables a better description of wave propagation in the subsurface, particularly when a substantial amount of heterogeneity is present. More generalized applications of geostatistical techniques will emerge, making it possible to introduce risk and uncertainty at the early stages of the seismic data processing and interpretation loop.

5 Neuro-Fuzzy Techniques

In recent years, considerable attention has been devoted to the use of hybrid neural-network/fuzzy-logic approaches as an alternative for pattern recognition, clustering,and statistical and mathematical modeling. It has been shown that neural networkmodels can be used to construct internal models that recognize fuzzy rules.

Neuro-fuzzy modeling is a technique for describing the behavior of a system using fuzzy inference rules within a neural network structure. The model has a unique feature in that it can express linguistically the characteristics of a complex nonlinear system. As a part of any future research opportunities, we will use the neuro-fuzzy model originally presented by Sugeno and Yasukawa (1993). The neuro-fuzzy model is characterized by a set of rules. The rules are expressed as follows:

Ri : if x1 is A1i and x2 is A2i . . . and xn is Ani (Antecedent) (1)

then y∗ = fi(x1, x2, . . . , xn) (Consequent)

For the linear case: fi(x1, x2, . . . , xn) = ai0 + ai1 x1 + ai2 x2 + . . . + ain xn (2)

Therefore, the predicted value for output y is given by:

y = Σi μi fi(x1, x2, . . . , xn) / Σi μi (3)

with

μi = Πj Aji(xj) (4)

where Ri is the ith rule, xj are the input variables, y is the output, Aji are fuzzy membership functions (fuzzy variables), and aij are constant values.

As a part of any future research opportunities, we will use the Adaptive Neuro-Fuzzy Inference System (ANFIS) technique originally presented by Jang (1992). The model uses neuro-adaptive learning techniques, which are similar to those of neural networks. Given an input/output data set, ANFIS can construct a fuzzy inference system (FIS) whose membership function parameters are adjusted using the back-propagation algorithm or other similar optimization techniques. This allows fuzzy systems to learn from the data they are modeling.
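The inference defined by (1) through (4) is straightforward to state in code. The sketch below assumes Gaussian membership functions and two hypothetical rules; it illustrates the weighted-average computation in (3), and is not the authors' implementation.

```python
import numpy as np

def gauss(x, c, s):
    """Gaussian membership function A(x) with center c and width s."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def sugeno_predict(x, rules):
    """Evaluate y = sum_i mu_i * f_i(x) / sum_i mu_i, eqs. (1)-(4).

    Each rule is (centers, widths, a), where a = [a_i0, a_i1, ..., a_in]
    defines the linear consequent f_i(x) = a_i0 + a_i1*x_1 + ... + a_in*x_n.
    """
    num, den = 0.0, 0.0
    for centers, widths, a in rules:
        mu = np.prod(gauss(x, centers, widths))   # eq. (4): product over j
        f = a[0] + np.dot(a[1:], x)               # eq. (2): linear consequent
        num += mu * f                             # eq. (3) numerator
        den += mu                                 # eq. (3) denominator
    return num / den

# Two hypothetical rules over two inputs
rules = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, 2.0, 0.0])),
    (np.array([1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.0, 0.0, 3.0])),
]
y = sugeno_predict(np.array([0.5, 0.5]), rules)   # weighted average of 2.0 and 1.5
```

In ANFIS, the centers, widths, and consequent coefficients above are exactly the parameters tuned by back-propagation.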

5.1 Neural-Fuzzy Model for Rule Extraction

In this section, a neuro-fuzzy model will be developed for model identification and knowledge extraction (rule extraction) purposes. The model is characterized by a set of rules which can be further used to represent the data in the form of linguistic variables; in this situation, the fuzzy variables become linguistic variables. The neuro-fuzzy technique is used to implicitly cluster the data while finding the nonlinear mapping. The neuro-fuzzy model developed in this study is an approximate fuzzy model with triangular and Gaussian membership functions, originally presented by Sugeno and Yasukawa (1993). The K-means technique is used for clustering, and the network is trained using a backpropagation algorithm and a modified Levenberg-Marquardt optimization technique.
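The K-means clustering step mentioned above can be sketched as follows; the returned centers would seed one fuzzy rule each. The data and cluster count here are illustrative, not from the study.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means: assign each sample to its nearest center, then move
    each center to the mean of its assigned samples."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for i in range(k):
            if np.any(labels == i):               # avoid emptying a cluster
                centers[i] = X[labels == i].mean(axis=0)
    return centers, labels

# Illustrative 1-D feature (e.g. one scaled log attribute) with two groups
X = np.array([[0.0], [0.1], [0.9], [1.0]])
centers, labels = kmeans(X, k=2)
```

Unlike fuzzy c-means, each sample here belongs to exactly one cluster; the neuro-fuzzy training then softens these crisp clusters into overlapping membership functions.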

In this study, the effect of rock parameters and seismic attenuation on permeability will be analyzed based on soft computing techniques and experimental data. The software will use fuzzy logic techniques, because the data and our requirements are imperfect. In addition, it will use neural network techniques, since the functional structure of the data is unknown. In particular, the software will be used to group data into important data sets; to extract and classify dominant and interesting patterns that exist between these data sets; and to discover secondary, tertiary, and higher-order data patterns. The objective of this section is to predict permeability based on grain size, clay content, porosity, P-wave velocity, and P-wave attenuation.

5.2 Prediction of Permeability Based on Porosity, Grain Size, Clay Content, P-wave Velocity, and P-wave Attenuation

In this section, a neural-fuzzy model will be developed for nonlinear mapping and rule extraction (knowledge extraction) between porosity, grain size, clay content, P-wave velocity, P-wave attenuation, and permeability. Figure 12 shows typical data used in this study. Permeability will be predicted based on the following rules and equations (1) through (4):

IF Rock Type = Sandstones
AND Porosity = [p1, p2]
AND Grain Size = [g1, g2]
AND Clay Content = [c1, c2]
AND P-Wave Vel. = [pwv1, pwv2]
AND P-Wave Att. = [pwa1, pwa2]
THEN Y∗ = a0 + a1∗P + a2∗G + a3∗C + a4∗PWV + a5∗PWA.

[Figure omitted: six panels plotting grain size (micron), porosity (percent), clay content (percent), P-wave velocity (m/sec), P-wave attenuation (dB/cm), and permeability (mD) against sample number (0-40).]

Fig. 12 Actual data (Boadu 1997)

where P is porosity (%), G is grain size, C is clay content, PWV is P-wave velocity, PWA is P-wave attenuation, and Y∗ is equivalent to fi in (1).

Data are scaled uniformly between −1 and 1, and the results are given in the scaled domain. The available data were divided into three data sets: training, testing, and validation. The neuro-fuzzy model was trained on the training data set and continuously tested on the test data set during the training phase. Training was stopped when it was found that the model's prediction suffered upon continued training. Next, the number of rules was increased by one and training was repeated. Using this technique, an optimal number of rules was selected. Figures 13A through 13E and Table 5 show typical rules extracted from the data. In Table 5, columns 1 through 5 show the membership functions for porosity, grain size, clay content, P-wave velocity, and P-wave attenuation, respectively.
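The stop-when-test-error-worsens procedure just described is ordinary early stopping. A generic sketch (the callback names are placeholders, not from the original software):

```python
def train_with_early_stopping(fit_epoch, eval_error, max_epochs=500, patience=10):
    """Run fit_epoch() once per epoch and stop when the test error returned
    by eval_error() has not improved for `patience` consecutive epochs."""
    best_err, best_epoch, stale = float("inf"), 0, 0
    for epoch in range(max_epochs):
        fit_epoch()                      # one training pass over the training set
        err = eval_error()               # error on the held-out test set
        if err < best_err:
            best_err, best_epoch, stale = err, epoch, 0
        else:
            stale += 1
            if stale >= patience:        # prediction degrades: stop training
                break
    return best_err, best_epoch
```

In the study, this inner loop would be wrapped in an outer loop that increments the rule count and keeps the count with the best held-out error.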

Using the model defined by equations (1) through (4) and the membership functions defined in Figs. 13A through 13E and Table 5, permeability was predicted as shown in Fig. 14A. In this study, 7 rules were identified for prediction of permeability based on porosity, grain size, clay content, P-wave velocity, and P-wave attenuation

[Plot omitted: membership functions over scaled Porosity for the 7 rules.]

Fig. 13A Typical rules extracted from data, 7 Rules (Porosity)

[Plot omitted: membership functions over scaled Grain Size for the 7 rules.]

Fig. 13B Typical rules extracted from data, 7 Rules (Grain Size)

[Plot omitted: membership functions over scaled Clay Content for the 7 rules.]

Fig. 13C Typical rules extracted from data, 7 Rules (Clay Content)

[Plot omitted: membership functions over scaled P-Wave Velocity for the 7 rules.]

Fig. 13D Typical rules extracted from data, 7 Rules (P-Wave Velocity)

[Plot omitted: membership functions over scaled P-Wave Attenuation for the 7 rules.]

Fig. 13E Typical rules extracted from data, 7 Rules (P-Wave Attenuation)

(Figs. 13A through 13E and Fig. 14A). In addition, 8 rules were identified for prediction of permeability based on porosity, clay content, P-wave velocity, and P-wave attenuation (Fig. 14B). Ten rules were identified for prediction of permeability based on porosity, P-wave velocity, and P-wave attenuation (Fig. 14C). Finally, 6 rules were identified for prediction of permeability based on grain size, clay content, P-wave velocity, and P-wave attenuation (Fig. 14D). The neural network model shows very good performance for prediction of permeability. In this situation, not only was a nonlinear mapping and relationship identified between porosity, grain size, clay content, P-wave velocity, P-wave attenuation, and permeability, but the rules existing in the data were also identified. For this case study, our software clustered the parameters as grain size, P-wave velocity/porosity (as confirmed by Fig. 15, since a clear linear relationship exists between these two variables), and P-wave attenuation/clay content (as confirmed by Fig. 16, since an approximate

Table 5 Boundary of rules extracted from data

Porosity            Grain Size          Clay Content        P-Wave Velocity     P-Wave Attenuation

[-0.4585, -0.3170]  [-0.6501, -0.3604]  [-0.6198, -0.3605]  [-0.0893, 0.2830]   [-0.6460, -0.3480]
[0.4208, 0.5415]    [-0.9351, -0.6673]  [0.2101, 0.3068]    [-0.7981, -0.7094]  [0.0572, 0.2008]
[-0.3610, -0.1599]  [-0.7866, -0.4923]  [-0.3965, -0.1535]  [-0.0850, 0.1302]   [-0.4406, -0.1571]
[-0.2793, -0.0850]  [-0.5670, -0.2908]  [-0.4005, -0.1613]  [-0.1801, 0.0290]   [-0.5113, -0.2439]
[-0.3472, -0.1856]  [-0.1558, 0.1629]   [-0.8093, -0.5850]  [0.1447, 0.3037]    [-0.8610, -0.6173]
[0.2700, 0.4811]    [-0.8077, -0.5538]  [-0.0001, 0.2087]   [-0.6217, -0.3860]  [-0.1003, 0.1316]
[-0.2657, -0.1061]  [0.0274, 0.3488]    [-0.4389, -0.1468]  [-0.1138, 0.1105]   [-0.5570, -0.1945]

[Cross-plot omitted: predicted vs. actual permeability (scaled); red: 5 rules, blue: 10 rules, green: 7 rules.]

Fig. 14A Performance of Neural-Fuzzy model for prediction of permeability [K = f(P, G, C, PWV, PWA)]

[Cross-plot omitted: predicted vs. actual permeability (scaled); red: 4 rules, green: 6 rules, blue: 8 rules.]

Fig. 14B Performance of Neural-Fuzzy model for prediction of permeability [K = f(P, C, PWV, PWA)]

[Cross-plot omitted: predicted vs. actual permeability (scaled); red: 3 rules, magenta: 5 rules, green: 7 rules, blue: 10 rules.]

Fig. 14C Performance of Neural-Fuzzy model for prediction of permeability [K = f(P, PWV, PWA)]

[Cross-plot omitted: predicted vs. actual permeability (scaled); red: 4 rules, green: 6 rules, blue: 8 rules.]

Fig. 14D Performance of Neural-Fuzzy model for prediction of permeability [K = f(G, C, PWV, PWA)]

[Cross-plot omitted: Porosity vs. P-Wave Velocity (both scaled).]

Fig. 15 Relationship between P-Wave Velocity and Porosity

[Cross-plot omitted: Clay Content vs. P-Wave Attenuation (both scaled).]

Fig. 16 Relationship between P-Wave Attenuation and Clay Content

linear relationship exists between these two variables). In addition, using the rules extracted, it was shown that P-wave velocity is closely related to porosity, and P-wave attenuation is closely related to clay content. Boadu (1997) also indicated that the most influential rock parameter on attenuation is the clay content. In addition, our software ranked the variables in the order grain size, P-wave velocity, P-wave attenuation, and clay content/porosity (since clay content and porosity can be predicted from P-wave velocity and P-wave attenuation).

6 Genetic Algorithms

Evolutionary computing represents computing with the use of some known mechanisms of evolution as key elements in algorithmic design and implementation. A variety of algorithms have been proposed. They all share a common conceptual base of simulating the evolution of individual structures via processes of parent selection, mutation, crossover, and reproduction. The major one is the genetic algorithm (GA) (Holland, 1975).

The genetic algorithm (GA) is a stochastic optimization method that simulates the process of natural evolution, following the same principles as those in nature (survival of the fittest, Charles Darwin). The GA was first presented by John Holland as academic research; today, however, it has turned out to be one of the most promising approaches for dealing with complex systems, an outcome that at first nobody could have imagined from such a relatively modest technique. The GA is applicable to multi-objective optimization and can handle conflicts among objectives; it is therefore robust where multiple solutions exist. In addition, it is highly efficient and easy to use.

Another important feature of the GA is its ability to extract knowledge in terms of fuzzy rules, and GAs are now widely applied to the discovery of fuzzy rules. However, when the data sets are very large, it is not easy to extract the rules. To overcome this limitation, a new coding technique, based on biological DNA, has been presented recently. The DNA coding method and the mechanism of development from artificial DNA are suitable for knowledge extraction from large data sets. The DNA can have many redundant parts, which is important for extraction of knowledge. In addition, this technique allows overlapped representation of genes, it has no constraint on crossover points, and the same type of mutation can be applied to every locus. In this technique, the length of the chromosome is variable, and it is easy to insert and/or delete any part of the DNA. Today, genetic algorithms can be used in a hierarchical fuzzy model for pattern extraction and to reduce the complexity of neuro-fuzzy models. In addition, the GA can be used to extract the number of membership functions required for each parameter and input variable, and for robust optimization along multidimensional, highly nonlinear, and non-convex search hyper-surfaces.

GAs work by first encoding the parameters of a given estimator as chromosomes (binary or floating-point). This is followed by populating a range of potential solutions. Each chromosome is evaluated by a fitness function. The better parent solutions are reproduced, and the next generation of solutions (children) is generated by applying the genetic operators (crossover and mutation). The children solutions are evaluated, and the whole cycle repeats until the best solution is obtained.
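The encode/evaluate/reproduce cycle above can be sketched as a toy floating-point GA. All parameter choices (population size, mutation rate, the example fitness function) are illustrative assumptions.

```python
import random

def genetic_algorithm(fitness, n_params, pop_size=40, n_gen=60, p_mut=0.1, seed=0):
    """Toy floating-point GA: keep the fitter half as parents, make children
    by one-point crossover, and apply Gaussian mutation. Maximizes `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(n_gen):
        scored = sorted(pop, key=fitness, reverse=True)
        elite = scored[: pop_size // 2]                  # better parents survive
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_params) if n_params > 1 else 0
            child = a[:cut] + b[cut:]                    # one-point crossover
            child = [g + rng.gauss(0, 0.1) if rng.random() < p_mut else g
                     for g in child]                     # Gaussian mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Hypothetical fitness surface with optimum at (0.3, -0.2)
best = genetic_algorithm(lambda p: -(p[0] - 0.3) ** 2 - (p[1] + 0.2) ** 2, n_params=2)
```

Because the elite half is carried over unchanged, the best solution found never regresses between generations.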

The methodology is in fact general and can be applied to optimizing parameters in other soft computing techniques, such as neural networks. In Yao (1999), the author gave an extensive review of the use of evolutionary computing in neural networks, with more than 300 references. The three general areas are: evolution of connection weights; evolution of neural network architectures; and evolution of learning rules.

Most geoscience applications began in the early 1990s. Gallagher and Sambridge (1994) presented an excellent overview of the use of GAs in seismology. Other applications include geochemical analysis, well logging, and seismic interpretation.

Fang et al. first used GAs to predict porosity and permeability from compositional and textural information and the Archie parameters in petrophysics. The same authors later used the same method to map geochemical data into a rock's mineral composition (1996). The performance was much better than the results obtained from linear regression and nonlinear least-squares methods.

In Huang et al. (1998), the authors used GAs to optimize the connection weights in a neural network for permeability prediction from well logs. The study showed that the GA-trained networks (neural-genetic model) gave consistently smaller errors compared to the networks trained by the conventional gradient descent algorithm (backpropagation). However, GAs were comparatively slow in convergence. In Huang et al. (2000), the same authors initialized the connection weights in GAs using the weights trained by backpropagation. The technique was also integrated with fuzzy reasoning, which gave a hybrid neural-fuzzy-genetic system (Huang, 1998). This improved the speed of convergence and still obtained better results.

Another important feature of GAs is their capability of extracting fuzzy rules. However, this becomes impractical when the data sets are large in size. To overcome this, a new encoding technique, based on the understanding of biological DNA, has been presented recently. Unlike conventional chromosomes, the length of the chromosome is variable, and it is flexible to insert new parts and/or delete redundant parts. In Yashikawa et al. (1998) and Nikravesh et al. (1998), the authors used a hybrid neural-fuzzy-DNA model for knowledge extraction from seismic data, mapping the well logs into seismic data, and reconstruction of porosity based on multi-attribute seismic mapping.

6.1 Geoscience Applications of Genetic Algorithms

Most of the applications of the GA in the area of petroleum reservoirs or geoscience are limited to inversion techniques, or use the GA as an optimization technique, while in other fields the GA is used as a powerful tool for extraction of knowledge, fuzzy rules, and fuzzy memberships, and in combination with neural networks and fuzzy logic. Recently, Nikravesh et al. (?) proposed to use a neuro-fuzzy-genetic model for data mining and fusion in the area of geoscience and petroleum reservoirs. In addition, it has been proposed to use a neuro-fuzzy-DNA model for extraction of knowledge from seismic data, mapping the wireline logs into seismic data, and reconstruction of porosity (and permeability, if reliable data exist for permeability) based on multi-attribute seismic mapping. Seismic inversion was accomplished using genetic algorithms by Mallick (1999). Potter et al. (1999) used GAs for stratigraphic analysis. For an overview of GAs in exploration problems, see McCormack et al. (1999).

7 Principal Component Analysis and Wavelet

Some of the data fusion and data mining methods used in exploration applications are as follows.

First, we need to reduce the space to make the data size more manageable, as well as to reduce the time required for data processing. We can use Principal Component Analysis: using the eigenvalues and eigenvectors, we can reduce the space domain by choosing the eigenvectors corresponding to the largest eigenvalues. Then, in the eigenvector space, we use the fuzzy k-means or fuzzy c-means technique. For details of the fuzzy c-means algorithm, see Cannon et al. (1986). Also see Lashgari (1991), Aminzadeh (1989), and Aminzadeh (1994) for the application of fuzzy logic and the fuzzy k-means algorithm in several earth exploration problems.
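The eigenvector-selection step can be sketched with NumPy's symmetric eigensolver. This is generic PCA reduction under illustrative data, not the specific processing chain of the cited studies.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project data onto the eigenvectors of the covariance matrix that
    correspond to the largest eigenvalues."""
    Xc = X - X.mean(axis=0)                        # center the data
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]  # indices of largest eigenvalues
    return Xc @ vecs[:, order]

# Illustrative data: all variance lies along the first attribute
X = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0], [4.0, 0.0, 0.0]])
Z = pca_reduce(X, n_components=1)                  # one retained component
```

Fuzzy c-means or k-means clustering would then be run on the reduced coordinates Z rather than on the raw attribute space.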

We can also use wavelets to extract the patterns and wavelets describing different geological settings and the respective rock properties. Using wavelets and a neural network, we can fuse the data for nonlinear modeling. For clustering purposes, we can take the output from the wavelet transform and use fuzzy c-means or fuzzy k-means. To account for uncertainty and see its effect, it is easy to attach a distribution to each point, or some weight for the importance of the data points. Once we assign a weight to each point, we can make each weight correspond to the number of points in a volume around that point.

Of course, the techniques based on principal component analysis have certain limitations. One of them is that the technique fails when the SNR is negative or zero; the reason for this is the singularity of the variance and covariance matrices. Therefore, an important step is to use a Kalman filter (KF) or some form of fuzzy set theory for noise reduction and extraction of the signal.

8 Intelligent Reservoir Characterization

In reservoir engineering, it is important to characterize how 3-D seismic information is related to production, lithology, geology, and logs (e.g. porosity, density, gamma ray, etc.) (Boadu 1997; Nikravesh 1998a-b; Nikravesh et al., 1998; Chawathe et al. 1997; Yoshioka et al. 1996; Schuelke et al. 1997; Monson and Pita 1997; Aminzadeh and Chatterjee, 1985). Knowledge of 3-D seismic data will help to reconstruct the 3-D volume of relevant reservoir information away from the well bore. However, data from well logs and 3-D seismic attributes are often difficult to analyze because of their complexity and our limited ability to understand and use the intensive information content of these data. Unfortunately, only linear and simple nonlinear information can be extracted from these data by standard statistical methods such as ordinary least squares, partial least squares, and nonlinear quadratic partial least squares. However, if a priori information regarding the nonlinear input-output mapping is available, these methods become more useful.

Simple mathematical models may become inaccurate because several assumptions are made to simplify the models in order to solve the problem. On the other hand, complex models may become inaccurate if additional equations, involving a more or less approximate description of phenomena, are included. In most cases, these models require a number of parameters that are not physically measurable. Neural networks (Hecht-Nielsen 1989) and fuzzy logic (Zadeh 1965) offer a third alternative and have the potential to establish a model from nonlinear, complex, and multi-dimensional data. They have found wide application in analyzing experimental, industrial, and field data (Baldwin et al. 1990; Baldwin et al. 1989; Pezeshk et al. 1996; Rogers et al. 1992; Wong et al. 1995a, 1995b; Nikravesh et al. 1996; Nikravesh and Aminzadeh, 1997). In recent years, the utility of neural network and fuzzy logic analysis has stimulated growing interest among reservoir engineers, geologists, and geophysicists (Nikravesh et al. 1998; Nikravesh 1998a; Nikravesh 1998b; Nikravesh and Aminzadeh 1998; Chawathe et al. 1997; Yoshioka et al. 1996; Schuelke et al. 1997; Monson and Pita 1997; Boadu 1997; Klimentos and McCann 1990; Aminzadeh and Katz 1994). Boadu (1997) and Nikravesh et al. (1998) successfully applied artificial neural networks and neuro-fuzzy techniques to find relationships between seismic data and rock properties of sandstone. In a recent study, Nikravesh and Aminzadeh (1999) used an artificial neural network to further analyze data published by Klimentos and McCann (1990) and analyzed by Boadu (1997). It was concluded that, to find nonlinear relationships, a neural network model provides better performance than a multiple linear regression model. Neural network, neuro-fuzzy, and knowledge-based models have been successfully used to model rock properties based on well log databases (Nikravesh, 1998b).

Monson and Pita (1997), Chawathe et al. (1997), and Nikravesh (1998b) successfully applied artificial neural networks and neuro-fuzzy techniques to find the relationships between 3-D seismic attributes and well logs, and to extrapolate the mapping away from the well bore to reconstruct log responses.

Adams et al. (1999a and 1999b), Levey et al. (1999), and Nikravesh et al. (1999a and 1999b) showed schematically the flow of information and techniques to be used for intelligent reservoir characterization (IRESC) (Fig. 17). The main goal will be to integrate soft data such as geological data with hard data such as 3-D seismic, production data, etc. to build a reservoir and stratigraphic model. Nikravesh et al. (1999a and 1999b) developed a new integrated methodology to identify a nonlinear relationship and mapping between 3-D seismic data and production-log data, and the technique was applied to a producing field.

[Diagram omitted: soft data (geological data) and hard data (log data, seismic data, mechanical well data, reservoir engineering data), together with economic and cost data and risk assessment, feed through a user interface into an inference engine or kernel that produces the reservoir model and stratigraphic model.]

Fig. 17 Integrated Reservoir Characterization (IRESC)

This advanced data analysis and interpretation methodology for 3-D seismic and production-log data uses conventional statistical techniques combined with modern soft-computing techniques. It can be used to predict: 1. mapping between production-log data and seismic data, 2. reservoir connectivity based on multi-attribute analysis, 3. pay zone recognition, and 4. optimum well placement (Fig. 18). Three criteria have been used to select potential locations for infill drilling or recompletion (Nikravesh et al., 1999a and 1999b): 1. continuity of the selected cluster, 2. size and shape of the cluster, and 3. existence of high Production-Index values inside a selected cluster with high Cluster-Index values. Based on these criteria, locations of the new wells were selected, one with high continuity and potential for high production and one with low continuity and potential for low production. The neighboring wells that are already in production confirmed such a prediction (Fig. 18).

Although these methodologies have limitations, the techniques are useful for fast screening of production zones with reasonable accuracy. This new methodology, combined with techniques presented by Nikravesh (1998a, 1998b), Nikravesh and Aminzadeh (1999), and Nikravesh et al. (1998), can be used to reconstruct well logs such as DT, porosity, density, resistivity, etc. away from the well bore. By doing so, net-pay-zone thickness, reservoir models, and geological representations will be accurately identified. Accurate reservoir characterization through data integration is an essential step in reservoir modeling, management, and production optimization.

8.1 Reservoir Characterization

Figure 17 shows schematically the flow of information and techniques to be used for intelligent reservoir characterization (IRESC). The main goal is to integrate soft data such as geological data with hard data such as 3-D seismic, production data, etc. to build reservoir and stratigraphic models. In this case study, we analyzed 3-D seismic attributes to find similarity cubes and clusters using three different techniques: 1. k-means, 2. neural network (self-organizing map), and 3. fuzzy c-means. The clusters can be interpreted as lithofacies, homogeneous classes, or similar patterns that exist in the data. The relationship between each cluster and production-log data was recognized around the well bore, and the results were used to reconstruct and extrapolate production-log data away from the well bore. The results from clustering were superimposed on the reconstructed production-log data, and optimal locations to drill new wells were determined.

[Map omitted: wells Ah and Bh (high production), Ch, Dh, and Fh (medium production), and Eh (low production), with proposed new well locations and regions of potential for high and low production shown against Production-Index and Cluster-Index overlays.]

Fig. 18 Optimal well placement (Nikravesh et al., 1999a and 1999b)

8.1.1 Examples

Our examples are from fields that produce from the Ellenburger Group. The Ellenburger is one of the most prolific gas producers in the conterminous United States, with greater than 13 TCF of production from fields in west Texas. The Ellenburger Group was deposited on an Early Ordovician passive margin in shallow subtidal to intertidal environments. Reservoir description indicates the study area is affected by a karst-related, collapsed paleocave system that acts as the primary reservoir in the field studied (Adams et al., 1999; Levey et al., 1999).

8.1.2 Area 1

The 3-D seismic volume used for this study has 3,178,500 data points (Table 6).Two hundred, seventy-four well-log data points intersect the seismic traces. Eighty-nine production log data points are available for analysis (19 production and 70non-production). A representative subset of the 3-D seismic cube, production logdata, and an area of interest were selected in the training phase for clustering andmapping purposes. The subset (150 samples, with each sample equal to 2 msec ofseismic data or approximately 20 feet of Ellenburger dolomite) was designed as asection (670 seismic traces) passing through all the wells as shown in Fig. 19 andhas 100,500 (670∗150) data points. However, only 34,170 (670∗51) data points wereselected for clustering purposes, representing the main Ellenburger focus area. Thissubset covers the horizontal boreholes of producing wells, and starts approximately15 samples (300 feet) above the Ellenburger, and ends 20 samples (400 feet) belowthe locations of the horizontal wells. In addition, the horizontal wells are presentin a 16-sample interval, for a total interval of 51 samples (102 msec or 1020 feet).Table 6 shows typical statistics for this case study. Figure 20 shows a schematicdiagram of how the well path intersects the seismic traces. For clustering and map-ping, there are two windows that must be optimized, the seismic window and thewell log window. Optimal numbers of seismic attributes and clusters need to bedetermined, depending on the nature of the problem. Figure 21 shows the iterativetechnique that has been used to select an optimal number of clusters, seismic at-tributes, and optimal processing windows for the seismic section shown in Fig. 19.Expert knowledge regarding geological parameters has also been used to constrainthe maximum number of clusters to be selected. 
In this study, six attributes have been selected (Raw Seismic, Instantaneous Amplitude, Instantaneous Phase, Cosine Instantaneous Phase, Instantaneous Frequency, and Integrated Absolute Amplitude) out of 17 attributes calculated (Table 7).

Figures 22 through 27 show typical representations of these attributes in our case study. Ten clusters were recognized, a window of one sample was used as the optimal window size for the seismic, and a window of three samples was used for the

Table 6 Typical statistics for main focus area, Area 1, and Ellenburger

Cube                                    Section
InLine: 163                             Total Number of Traces: 670
Xline: 130                              Time Samples: 150
Time Samples: 150                       Total Number of Points: 100,500
Total Number of Points: 3,178,500       Used for Clustering: 34,170

Section/Cube: 3.16%                     For Clustering: 1.08%

Well Data                               Production Data
Total Number of Points: 274             Total Number of Points: 89
Well Data/Section: 0.80%                Production: 19
Well Data/Cube: 0.009%                  No Production: 70

Production Data/Section: 0.26%
Production Data/Cube: 0.003%

312 M. Nikravesh

[Map annotations from Fig. 19: wells Ah, Bh1, Bh2, Bh3, Ch1, Ch2, Cv, Dh, Dv, Eh, and Fh; the well paths and the section path are indicated, with a scale bar in miles (0 to 0.5).]

Fig. 19 Seismic section passing through all the wells, Area 1

production log data. Based on qualitative analysis, specific clusters with the potential to be in producing zones were selected. Software was developed to perform the qualitative analysis and was run on a personal computer using MatlabTM. Figure 28 shows typical windows and parameters of this software. Clustering was based on three different techniques: k-means (statistical), neural network, and fuzzy c-means

Seismic TracesWell Path

Seismic WindowWell Log Window

Fig. 20 Schematic diagram of how the well path intersects the seismic traces

[Flow-diagram content from Fig. 21: an iterative loop over the optimal number of clusters, the optimal number of attributes, and the optimal processing window and sub-window. Attribute boxes list raw seismic, amplitude envelope, instantaneous frequency, instantaneous phase, cosine instantaneous phase, and integrated absolute amplitude. Notes indicate that pseudolog generation from seismic requires advanced techniques (see IRESC model), that section and cube products are generated around the wellbore, and that outputs include Gas/No-Gas, Breccia/No-Breccia, and lithologies DOL, LS, SH, CHERT, and others (SS, COAL).]

Fig. 21 Iterative technique to select an optimal number of clusters, seismic attributes, and optimal processing windows

clustering. Different techniques recognized different cluster patterns, as shown by the cluster distributions (Figs. 29A through 31), which display the distribution of clusters in the section passing through the wells shown in Fig. 19. By comparing the k-means clusters (Fig. 29A) and neural network clusters (Fig. 30) with the fuzzy clusters (Fig. 31), one can conclude that the neural network predicted a different

Table 7 List of the attributes calculated in this study

Attribute No. abbrev. Attribute

 1. ampenv   Amplitude Envelope
 2. ampwcp   Amplitude Weighted Cosine Phase
 3. ampwfr   Amplitude Weighted Frequency
 4. ampwph   Amplitude Weighted Phase
 5. aprpol   Apparent Polarity
 6. avgfre   Average Frequency
 7. cosiph   Cosine Instantaneous Phase
 8. deriamp  Derivative Instantaneous Amplitude
 9. deriv    Derivative
10. domfre   Dominant Frequency
11. insfre   Instantaneous Frequency
12. inspha   Instantaneous Phase
13. intaamp  Integrated Absolute Amplitude
14. integ    Integrate
15. raw      Raw Seismic
16. sdinam   Second Derivative Instantaneous Amplitude
17. secdev   Second Derivative

Fig. 22 Typical time slice of Raw Seismic in Area 1 with Rule 1

Fig. 23 Typical time slice of Amplitude envelope in Area 1 with Rule 1

Fig. 24 Typical time slice of Instantaneous Phase in Area 1 with Rule 1

Fig. 25 Typical time slice of Cosine Instantaneous Phase in Area 1 with Rule 1

Fig. 26 Typical time slice of Instantaneous Frequency in Area 1 with Rule 1

Fig. 27 Typical time slice of Integrated Absolute Amplitude in Area 1 with Rule 1

Fig. 28 MatLabTM software to do the qualitative analysis

structure and patterns than did the other techniques. Figures 29B and 29C show a typical time slice from the 3-D seismic cube that has been reconstructed with the extrapolated k-means cluster data. Finally, based on a qualitative analysis, specific clusters that have the potential to include producing zones were selected. Each clustering technique produced two clusters that included most of the production data. Each of these three pairs of clusters is equivalent.

To confirm such a conclusion, cluster patterns were generated for the section passing through the wells as shown in Fig. 19. Figures 32 through 34 show the two clusters from each technique that correlate with production: clusters one and four from k-means clustering (Fig. 32); clusters one and six from neural network clustering (Fig. 33); and clusters six and ten from fuzzy c-means clustering (Fig. 34). By comparing these three cross sections, one can conclude that, in the present study, all three techniques predicted the same pair of clusters based on the objective of predicting potential producing zones. However, this may not always be the case because the information that can be extracted by the different techniques may differ. For example, clusters using classical techniques will have sharp boundaries, whereas those generated using the fuzzy technique will have fuzzy boundaries.
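The contrast between sharp and fuzzy boundaries can be made concrete with a minimal fuzzy c-means sketch: samples near a center receive memberships near one, while a sample midway between centers keeps intermediate memberships that a hard k-means assignment would discard. The data and implementation below are illustrative, not the chapter's software.

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns (centers, membership matrix U)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)             # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # u_ij = 1 / sum_k (d_ij/d_ik)^(2/(m-1))
    return centers, U

# Two synthetic "attribute" populations plus one sample on the boundary.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.2, size=(50, 2)),
               rng.normal(2.0, 0.2, size=(50, 2)),
               [[1.0, 1.0]]])                     # sits between the populations

centers, U = fuzzy_cmeans(X, c=2)
hard = U.argmax(axis=1)        # collapsing memberships yields a crisp partition
boundary_membership = U[-1]    # near (0.5, 0.5): genuinely ambiguous sample
```

Taking `argmax` over the memberships reproduces a k-means-style crisp assignment, which is exactly the information loss at a sharp boundary described above.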

Based on the clusters recognized in Figures 32 through 34 and the production log data, a subset of the clusters has been selected and assigned as cluster 11, as shown in Figures 35 and 36. In this sub-cluster, the relationship between production-log data and clusters has been recognized, and the production-log data has been reconstructed and extrapolated away from the well bore. Finally, the production-log data and the

Fig. 29A Typical k-means distribution of clusters in the section passing through the wells as shown in Fig. 19

Fig. 29B Typical 2-D presentation of time-slice of k-means distribution of clusters

Fig. 29C Typical 3-D presentation of time-slice of k-means distribution of clusters

Fig. 30 Typical neural network distribution of clusters in the section passing through the wells as shown in Fig. 19

Fig. 31 Typical fuzzy c-means distribution of clusters in the section passing through the wells as shown in Fig. 19

cluster data were superimposed at each point in the 3-D seismic cube. Figures 37A and 37B show a typical time slice of a 3-D seismic cube that has been reconstructed with the extrapolated production-log data and cluster data. The color scale in Figs. 37A and 37B is divided into two indices, the Cluster Index and the Production Index. Criteria used to define Cluster Indices for each point are expressed as a series of dependent IF-THEN statements.

Before defining Production Indices for each point within a specified cluster, a new cluster must first be defined based only on the seismic data that represents production-log data with averaged values greater than 0.50. Averaged values are determined by assigning a value to each sample of a horizontal borehole (two feet/sample). Sample intervals that are producing gas are assigned values of one, and non-producing sample intervals are assigned values of zero. The optimum window size for production-log data is three samples, and the averaged value at any point is the average of the samples in the surrounding window. After the new cluster is determined, a series of IF-THEN statements is used to define the Production Indices.
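The windowed averaging of the binary production flags can be sketched as follows. The 0.8/0.5 index thresholds in `production_index` are illustrative only, standing in for the chapter's unstated IF-THEN statements; the 0.5 cutoff for admitting a sample to the new cluster is the one given in the text.

```python
import numpy as np

def production_average(flags, window=3):
    """Centered moving average of 0/1 producing flags along a borehole."""
    pad = window // 2
    padded = np.pad(np.asarray(flags, dtype=float), pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

flags = [0, 0, 1, 1, 1, 0, 1, 0, 0]      # 1 = gas-producing 2-ft sample
avg = production_average(flags)
in_new_cluster = avg > 0.5               # samples admitted to the new cluster

def production_index(a):
    # Illustrative IF-THEN ladder over the averaged value a.
    if a > 0.8:
        return "high"
    elif a > 0.5:
        return "medium"
    else:
        return "low"

indices = [production_index(a) for a in avg]
```

Note how the averaging lets an isolated producing sample surrounded by non-producing ones fall below the 0.5 cutoff, so the new cluster captures sustained producing intervals rather than single samples.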

Three criteria have been used to select potential locations for infill drilling or recompletion: 1. continuity of the selected cluster, 2. size and shape of the cluster, and 3. existence of high Production-Index values inside a selected cluster with high Cluster-Index values. Based on these criteria, locations of the new wells were selected, and two such locations are shown in Fig. 37B: one with high continuity and potential for high production, and one with low continuity and potential for low production. The neighboring wells that are already in production confirm such a prediction, as shown in Fig. 37B.

Fig. 32 Clusters one and four that correlate with production using the k-means clustering technique on the section passing through the wells as shown in Fig. 19

Fig. 33 Clusters one and six that correlate with production using the neural network clustering technique on the section passing through the wells as shown in Fig. 19

Fig. 34 Clusters six and ten that correlate with production using the fuzzy c-means clustering technique on the section passing through the wells as shown in Fig. 19

Fig. 35 The section passing through all wells showing a typical distribution of clusters generated by combining the three sets of 10 clusters created by each clustering technique (k-means, neural network, and fuzzy c-means). Note that eleven clusters are displayed rather than the ten clusters originally generated. The eleventh cluster is a subset of the three pairs of clusters that contain most of the production

[Well annotations from Fig. 36: Well Ah, high production; Well Bh, high production; Well Ch, medium production; Well Dh, medium production; Well Eh, low production; Well Fh, medium production.]

Fig. 36 The section passing through all wells showing only the eleventh cluster, a subset of the three pairs of clusters that contains most of the production

[Annotations from Fig. 37A: color scales for the Cluster Index and the Production Index; areas labeled “Potential for Low production” and “Potential for High production”.]

Fig. 37A A 2-D time slice showing clusters and production, with areas indicated for optimal well placement

[Annotations from Fig. 37B: new well locations; wells Ah and Bh, high production; wells Ch, Dh, and Fh, medium production; Well Eh, low production; color scales for the Cluster Index and the Production Index; areas labeled “Potential for Low production” and “Potential for High production”.]

Fig. 37B An oblique view of the 2-D time slice in Fig. 37A showing clusters and production, with areas indicated for optimal well placement

9 Fractured Reservoir Characterization

When we are faced with fractured reservoir characterization in particular, an efficient method of data entry, compiling, and preparation becomes important. Not only does the initial model require a considerable amount of data preparation, but subsequent stages of model updating will also require a convenient way to input new data into the existing data stream. Well-log suites provided by the operator will be supplied to the project team. We anticipate a spectrum of resistivity logs, image logs, cuttings, and core where available. A carefully designed data collection phase will provide the necessary input to develop a 3-D model of the reservoir. An optimum number of test wells and training wells needs to be identified. In addition, a new technique needs to be developed to optimize the location and orientation of each new well to be drilled based on data gathered from previous wells. If possible, we want to prevent clustering of too many wells at some locations and under-sampling at other locations, thus maintaining a level of randomness in data acquisition. The data to be collected will depend on the type of fractured reservoir.

The data collected will also provide the statistics to establish the trends, variograms, shape, and distribution of the fractures in order to develop a non-linear and non-parametric statistical model and various possible realizations of this model. For example, one can use stochastic modeling techniques and the Alternating Conditional Expectation (ACE) model developed by Breiman and Friedman [1985] for initial reservoir model prediction. This provides crucial information on the variability of the estimated models. Significant changes from one realization to the other indicate a high level of uncertainty, and thus the need for additional data to reduce the standard

deviation. In addition, one can use our neuro-fuzzy approach to better quantify and perhaps reduce the uncertainties in the characterization of the reservoir.
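The link between realization-to-realization variability and the need for additional data can be sketched as a cell-wise standard deviation across stochastic realizations. The grid size, noise levels, and threshold below are illustrative assumptions, not values from the chapter.

```python
import numpy as np

# Several stochastic realizations of a reservoir property (e.g. fracture
# density) on the same grid; where they disagree, uncertainty is high.
rng = np.random.default_rng(0)
trend = rng.random((20, 20))                       # shared large-scale trend
realizations = np.stack([trend + rng.normal(0.0, s, size=(20, 20))
                         for s in (0.05, 0.05, 0.05, 0.5)])

cell_std = realizations.std(axis=0)                # per-cell disagreement map
needs_data = cell_std > cell_std.mean()            # candidate infill locations
```

Cells where `cell_std` is large are exactly those where "significant changes from one realization to the other" occur, flagging where new measurements would most reduce the standard deviation.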

Samples from well cuttings (commonly available) and cores (where available) from the focus area can also be analyzed semi-quantitatively by XRD analysis of clay mineralogy to determine vertical variability. Calibration to image logs needs to be performed to correlate fracture density to conventional log signatures and mineralogical analysis.

Based on the data obtained and the statistical representation of the data, an initial 3-D model of the boundaries of the fractures and their distribution can be developed. The model is represented by a multi-valued parameter, which reflects different subsurface properties to be characterized. This parameter is derived through integration of all the input data using a number of conventional statistical approaches.

A novel “neuro-fuzzy” based algorithm that combines the training and learning capabilities of conventional neural networks with the capabilities of fuzzy logic to incorporate subjective and imprecise information can be refined for this application. Nikravesh [1998a, b] showed the significant superiority of the neuro-fuzzy approach for data integration over conventional methods for characterizing the boundaries. A similar method, with minor modifications, can be implemented and tested for fractured reservoirs.

Based on this information, an initial estimate for the distribution of reservoir properties, including fracture shape and distribution in 2-D and 3-D space, can be predicted. Finally, the reservoir model is used as an input to this step to develop an optimum strategy for management of the reservoir. As data collection continues in the observation wells, the model parameters will be updated using the new data. These models are then continually evaluated and visualized to assess the effectiveness of the production strategy. The wells chosen in the data collection phase will be designed and operated with the support of an intelligent advisor.

10 Future Trends and Conclusions

We have discussed the main areas where soft computing can make a major impact in geophysical, geological, and reservoir engineering applications in the oil industry. These areas include facilitation of automation in data editing and data mining. We also pointed out applications in non-linear signal processing (of geophysical and log data), and better parameterization of wave equations with random or fuzzy coefficients, both in seismic and other geophysical wave propagation equations and in those used in reservoir simulation. Of significant importance is their use in data integration and reservoir property estimation. Finally, quantification and reduction of uncertainty and confidence intervals are possible through more comprehensive use of fuzzy logic and neural networks. The true benefit of soft computing, which is to use the intelligent techniques in combination (hybrid) rather than in isolation, has not yet been demonstrated to its full extent. This section will address two particular areas for future research: hybrid systems and computing with words.

10.1 Hybrid Systems

So far we have seen the primary roles of neurocomputing, fuzzy logic, and evolutionary computing. Their roles are in fact unique and complementary. Many hybrid systems can be built. For example, fuzzy logic can be used to combine results from several neural networks; GAs can be used to optimize the number of fuzzy rules; linguistic variables can be used to improve the performance of GAs; and fuzzy rules can be extracted from trained neural networks. Although some hybrid systems have been built, this topic has not yet reached maturity and certainly requires more field studies.
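As a toy instance of the first hybrid (fuzzy logic combining results from several neural networks), the networks' predictions can be weighted by a credibility derived from each network's validation error. All numbers below are illustrative assumptions.

```python
import numpy as np

preds = np.array([0.18, 0.22, 0.20])     # porosity estimates from three networks
val_rmse = np.array([0.02, 0.05, 0.03])  # each network's validation error

weights = 1.0 / val_rmse                 # lower error -> higher credibility
weights /= weights.sum()                 # normalize to a fuzzy weighting
combined = float(weights @ preds)        # consensus estimate
```

The consensus always lies inside the range of the individual predictions, and the most reliable network dominates it; richer schemes would replace the inverse-error weights with full fuzzy membership functions over each network's operating region.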

In order to make full use of soft computing for intelligent reservoir characterization, it is important to note that the design and implementation of hybrid systems should aim to improve prediction and its reliability. At the same time, the improved systems should contain a small number of sensitive user-definable model parameters and use less CPU time. The future development of hybrid systems should incorporate the various disciplinary knowledge of reservoir geoscience and maximize the amount of useful information extracted between data types so that reliable extrapolation away from the wellbores can be obtained.

10.2 Computing with Words

One of the major difficulties in reservoir characterization is to devise a methodology to integrate qualitative geological descriptions. One simple example is the core descriptions in standard core analysis. These descriptions provide useful and meaningful observations about the geological properties of core samples. They may serve to explain many geological phenomena in well logs, mud logs, and petrophysical properties (porosity, permeability, and fluid saturations). Yet these details are not utilized due to the lack of a suitable computational tool. Gedeon et al. (1999) provided one of the first attempts to relate these linguistic descriptions (grain size, sorting, matrix, roundness, bioturbation, and lamina) to core porosity levels (very poor, poor, fair, and good) using intelligent techniques. The results were promising and came a step closer to Zadeh’s idea of computing with words (Zadeh, 1996).

Computing with words (CW) aims to perform computing with objects which are propositions drawn from a natural language or which have the form of mental perceptions. In essence, it is inspired by the remarkable human capability to manipulate words and perceptions and perform a wide variety of physical and mental tasks without any measurements or computations. It is fundamentally different from traditional expert systems, which are simply tools to “realize” an intelligent system but are not able to process natural language, which is imprecise, uncertain, and partially true. CW has gained much popularity in many engineering disciplines (Zadeh, 1999a and 1999b). In fact, CW plays a pivotal role in fuzzy logic and vice versa. Another aspect of CW is that it also involves a fusion of natural languages and computation with fuzzy variables.

In reservoir geology, natural language has been playing a very crucial role for a long time. We are faced with many intelligent statements and questions on a daily basis. For example: “if the porosity is high then permeability is likely to be high”; “most seals are beneficial for hydrocarbon trapping; a seal is present in reservoir A; what is the probability that the seal in reservoir A is beneficial?”; and “high resolution log data is good; the new sonic log is of high resolution; what can be said about the goodness of the new sonic log?”
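The first statement above can be cast as a one-rule fuzzy inference. The ramp breakpoints (10% and 20% porosity) are illustrative choices, not values from the chapter:

```python
def mu_high_porosity(phi):
    """Membership of 'porosity is high': a linear ramp from 0.10 to 0.20."""
    return min(1.0, max(0.0, (phi - 0.10) / 0.10))

def mu_high_permeability(phi):
    # Rule: IF porosity is high THEN permeability is high.
    # The conclusion holds to the degree the antecedent does (Mamdani truncation).
    return mu_high_porosity(phi)

degrees = {phi: round(mu_high_permeability(phi), 2) for phi in (0.05, 0.15, 0.25)}
```

A porosity of 15% thus supports "permeability is high" only to degree 0.5, which is exactly the graded, word-level reasoning that crisp expert-system rules cannot express.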

CW has much to offer in reservoir characterization because most available reservoir data and information are too imprecise. There is a strong need to exploit the tolerance for such imprecision, which is the prime motivation for CW. Future research in this direction will surely provide a significant contribution in bridging reservoir geology and reservoir engineering.

Given the level of interest and the number of useful networks developed for earth science applications, and especially the oil industry, it is expected that soft computing techniques will play a key role in this field. Many commercial packages based on soft computing are emerging. The challenge is how to explain or “sell” the concepts and foundations of soft computing to practicing explorationists and convince them of the validity, relevance, and reliability of results based on intelligent systems using soft computing methods.

Acknowledgments The author would like to thank Dr. Fred Aminzadeh for his contribution to an earlier version of this paper, for his feedback and comments, and for allowing the author to use his published and unpublished documents, papers, and presentations in preparing this paper.

References

1. Adams, R.D., J.W. Collister, D.D. Ekart, R.A. Levey, and M. Nikravesh, 1999a, Evaluation of gas reservoirs in collapsed paleocave systems, Ellenburger Group, Permian Basin, Texas, AAPG Annual Meeting, San Antonio, TX, 11–14 April.

2. Adams, R.D., M. Nikravesh, D.D. Ekart, J.W. Collister, R.A. Levey, and R.W. Siegfried, 1999b, Optimization of well locations in complex carbonate reservoirs, GasTIPS, Vol. 5, No. 2, GRI, Summer 1999.

3. Aminzadeh, F. and Jamshidi, M.: Soft Computing: Fuzzy Logic, Neural Networks, and Distributed Artificial Intelligence, PTR Prentice Hall, Englewood Cliffs, NJ (1994).

4. Adams, R.D., Nikravesh, M., Ekart, D.D., Collister, J.W., Levey, R.A. and Seigfried, R.W.: “Optimization of Well Locations in Complex Carbonate Reservoirs,” GasTIPS, Vol. 5, No. 2, GRI (1999).

5. Aminzadeh, F., 1989, Future of expert systems in the oil industry, Proceedings of HARC Conference on Geophysics and Computers: A Look at the Future.

6. Aminzadeh, F. and Jamshidi, M., Eds., 1995, Soft Computing, PTR Prentice Hall.

7. Aminzadeh, F. and S. Chatterjee, 1984/85, Applications of clustering in exploration seismology, Geoexploration, v23, p. 147–159.

8. Aminzadeh, F., 1989, Pattern Recognition and Image Processing, Elsevier Science Ltd.; Aminzadeh, F., 1994, Applications of fuzzy expert systems in integrated oil exploration, Computers and Electrical Engineering, No. 2, pp. 89–97.

9. Aminzadeh, F., 1991, Where are we now and where are we going?, in Expert Systems in Exploration, Aminzadeh, F. and M. Simaan, Eds., SEG Publications, Tulsa, 3–32.

10. Aminzadeh, F., S. Katz, and K. Aki, 1994, Adaptive neural network for generation of artificial earthquake precursors, IEEE Trans. on Geoscience and Remote Sensing, 32 (6).

11. Aminzadeh, F., 1996, Future geoscience technology trends, in Stratigraphic Analysis: Utilizing Advanced Geophysical, Wireline, and Borehole Technology for Petroleum Exploration and Production, GCSEPFM, pp. 1–6.

12. Aminzadeh, F., Barhen, J., Glover, C.W. and Toomanian, N.B., 1999, Estimation of reservoir parameters using a hybrid neural network, Journal of Science and Petroleum Engineering, Vol. 24, pp. 49–56.

13. Aminzadeh, F., Barhen, J., Glover, C.W. and Toomanian, N.B., 2000, Reservoir parameter estimation using hybrid neural network, Computers & Geosciences, Vol. 26, pp. 869–875.

14. Aminzadeh, F., and de Groot, P., 2001, Seismic characters and seismic attributes to predict reservoir properties, Proceedings of SEG-GSH Spring Symposium.

15. Baldwin, J.L., A.R.M. Bateman, and C.L. Wheatley, 1990, Application of neural network to the problem of mineral identification from well logs, The Log Analyst, 3, 279.

16. Baldwin, J.L., D.N. Otte, and C.L. Wheatley, 1989, Computer emulation of human mental processes: Application of neural network simulations to problems in well log interpretation, Society of Petroleum Engineers, SPE Paper 19619, 481.

17. Baygun et al., 1985, Applications of Neural Networks and Fuzzy Logic in Geological Modeling of a Mature Hydrocarbon Reservoir, Report No. ISD-005-95-13a, Schlumberger-Doll Research.

18. Benediktsson, J.A., Swain, P.H., and Erson, O.K., 1990, Neural network approaches versus statistical methods in classification of multisource remote sensing data, IEEE Transactions on Geoscience and Remote Sensing, Vol. 28, No. 4, 540–552.

19. Bezdek, J.C., 1981, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York.

20. Bezdek, J.C., and S.K. Pal, eds., 1992, Fuzzy Models for Pattern Recognition, IEEE Press, 539 p.

21. Boadu, F.K., 1997, Rock properties and seismic attenuation: Neural network analysis, Pure and Applied Geophysics, v149, pp. 507–524.

22. Bezdek, J.C., Ehrlich, R. and Full, W.: “The Fuzzy C-Means Clustering Algorithm,” Computers and Geosciences (1984) 10, 191–203.

23. Bois, P., 1983, Some applications of pattern recognition to oil and gas exploration, IEEE Transactions on Geoscience and Remote Sensing, Vol. GE-21, No. 4, pp. 687–701.

24. Bois, P., 1984, Fuzzy seismic interpretation, IEEE Transactions on Geoscience and Remote Sensing, Vol. GE-22, No. 6, pp. 692–697.

25. Breiman, L., and J.H. Friedman, 1985, Journal of the American Statistical Association, 580.

26. Cannon, R.L., Dave, J.V. and Bezdek, J.C., 1986, Efficient implementation of fuzzy c-means algorithms, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. PAMI-8, No. 2.

27. Bruce, A.G., Wong, P.M., Tunggal, Widarsono, B. and Soedarmo, E.: “Use of Artificial Neural Networks to Estimate Well Permeability Profiles in Sumatera, Indonesia,” Proceedings of the 27th Annual Conference and Exhibition of the Indonesia Petroleum Association (1999).

28. Bruce, A.G., Wong, P.M., Zhang, Y., Salisch, H.A., Fung, C.C. and Gedeon, T.D.: “A State-of-the-Art Review of Neural Networks for Permeability Prediction,” APPEA Journal (2000), in press.

29. Caers, J.: “Towards a Pattern Recognition-Based Geostatistics,” Stanford Center for Reservoir Forecasting Annual Meeting (1999).

30. Caers, J. and Journel, A.G.: “Stochastic Reservoir Simulation using Neural Networks Trained on Outcrop Data,” SPE 49026, SPE Annual Technical Conference and Exhibition, New Orleans (1998), 321–336.

31. Chawathe, A., A. Quenes, and W.W. Weiss, 1997, Interwell property mapping using crosswell seismic attributes, SPE 38747, SPE Annual Technical Conference and Exhibition, San Antonio, TX, 5–8 Oct.

32. Chen, H.C., Fang, J.H., Kortright, M.E., Chen, D.C., Novel approaches to the determination of Archie parameters II: Fuzzy regression analysis, SPE 26288.

33. Cybenko, G., 1989, Approximation by superposition of a sigmoidal function, Math. Control Sig. Systems, v2, p. 303.

34. Cuddy, S.: “The Application of Mathematics of Fuzzy Logic to Petrophysics,” SPWLA Annual Logging Symposium (1997), paper S.

35. Bloch, S.: “Empirical Prediction of Porosity and Permeability in Sandstones,” AAPG Bulletin (1991) 75, 1145–1160.

36. Fang, J.H., Karr, C.L. and Stanley, D.A.: “Genetic Algorithm and Application to Petrophysics,” SPE 26208, unsolicited paper.

37. Fang, J.H., Karr, C.L. and Stanley, D.A.: “Transformation of Geochemical Log Data to Mineralogy using Genetic Algorithms,” The Log Analyst (1996) 37, 26–31.

38. Fang, J.H. and Chen, H.C.: “Fuzzy Modelling and the Prediction of Porosity and Permeability from the Compositional and Textural Attributes of Sandstone,” Journal of Petroleum Geology (1997) 20, 185–204.

39. Gallagher, K. and Sambridge, M.: “Genetic Algorithms: A Powerful Tool for Large-Scale Nonlinear Optimization Problems,” Computers and Geosciences (1994) 20, 1229–1236.

40. Gedeon, T.D., Tamhane, D., Lin, T. and Wong, P.M.: “Use of Linguistic Petrographical Descriptions to Characterise Core Porosity: Contrasting Approaches,” submitted to Journal of Petroleum Science and Engineering (1999).

41. Gupta, M.M. and H. Ding, 1994, Foundations of fuzzy neural computations, in Soft Computing, Aminzadeh, F. and Jamshidi, M., Eds., PTR Prentice Hall, 165–199.

42. Hecht-Nielsen, R., 1989, Theory of backpropagation neural networks, presented at IEEE Proc., Int. Conf. Neural Networks, Washington DC.

43. Hecht-Nielsen, R., 1990, Neurocomputing, Addison-Wesley Publishing, p. 433.

44. Holland, J.: Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor (1975).

45. Horikawa, S., T. Furuhashi, A. Kuromiya, M. Yamaoka, and Y. Uchikawa, Determination of antecedent structure for fuzzy modeling using genetic algorithm, Proc. ICEC’96, IEEE International Conference on Evolutionary Computation, Nagoya, Japan, May 20–22, 1996.

46. Horikawa, S., T. Furuhashi, and Y. Uchikawa, On fuzzy modeling using fuzzy neural networks with the backpropagation algorithm, IEEE Trans. on Neural Networks, 3 (5), 1992.

47. Horikawa, S., T. Furuhashi, Y. Uchikawa, and T. Tagawa, “A Study on Fuzzy Modeling Using Fuzzy Neural Networks,” Fuzzy Engineering toward Human Friendly Systems, IFES’91.

48. Huang, Z. and Williamson, M.A., 1994, Geological pattern recognition and modeling with a general regression neural network, Canadian Journal of Exploration Geophysics, Vol. 30, No. 1, 60–68.

49. Huang, Y., Gedeon, T.D. and Wong, P.M.: “A Practical Fuzzy Interpolator for Prediction of Reservoir Permeability,” Proceedings of IEEE International Conference on Fuzzy Systems, Seoul (1999).

50. Huang, Y., Wong, P.M. and Gedeon, T.D.: “Prediction of Reservoir Permeability using Genetic Algorithms,” AI Applications (1998) 12, 67–75.

51. Huang, Y., Wong, P.M. and Gedeon, T.D.: “Permeability Prediction in Petroleum Reservoir using a Hybrid System,” in Soft Computing in Industrial Applications, Suzuki, Roy, Ovaska, Furuhashi and Dote (eds.), Springer-Verlag, London (2000), in press.

52. Huang, Y., Wong, P.M. and Gedeon, T.D.: “Neural-fuzzy-genetic Algorithm Interpolator in Log Analysis,” EAGE Conference and Technical Exhibition, Leipzig (1998), paper P106.

53. Jang, J.S.R., and N. Gulley, 1995, Fuzzy Logic Toolbox, The MathWorks Inc., Natick, MA.

54. Jang, J.S.R., 1992, Self-learning fuzzy controllers based on temporal backpropagation, IEEE Trans. Neural Networks, 3 (5).

55. Jang, J.S.R., 1991, Fuzzy modeling using generalized neural networks and Kalman filter algorithm, Proc. of the Ninth National Conference on Artificial Intelligence, pp. 762–767.

56. Jang, J.-S.R., Sun, C.-T. and Mizutani, E.: Neuro-Fuzzy and Soft Computing, Prentice-Hall International Inc., NJ (1997).

57. Johnson, V.M. and Rogers, L.L., Location analysis in ground water remediation using artificial neural networks, Lawrence Livermore National Laboratory Report, UCRL-JC-17289.

58. Klimentos, T., and C. McCann, 1990, Relationship among compressional wave attenuation, porosity, clay content and permeability in sandstones, Geophysics, v55, p. 998–1014.

59. Kohonen, T., 1997, Self-Organizing Maps, Second Edition, Springer, Berlin.

60. Kohonen, T., 1987, Self-Organization and Associative Memory, 2nd Edition, Springer-Verlag, Berlin.

61. Lashgari, B., 1991, Fuzzy classification with applications, in Expert Systems in Exploration, Aminzadeh, F. and M. Simaan, Eds., SEG Publications, Tulsa, 3–32.

62. Levey, R., M. Nikravesh, R. Adams, D. Ekart, Y. Livnat, S. Snelgrove, J. Collister, 1999, Evaluation of fractured and paleocave carbonate reservoirs, AAPG Annual Meeting, San Antonio, TX, 11–14 April.

63. Lin, C.T., and Lee, C.S.G., 1996, Neural Fuzzy Systems, Prentice Hall, Englewood Cliffs, New Jersey.

64. Lippmann, R.P., 1987, An introduction to computing with neural networks, ASSP Magazine, April, 4–22.

65. Liu, X., Xue, P., Li, Y., 1989, Neural network method for tracing seismic events, 59th Annual SEG Meeting, Expanded Abstracts, 716–718.

66. MacQueen, J., 1967, Some methods for classification and analysis of multivariate observations, in LeCam, L.M. and Neyman, J., eds., Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, v.1, 281–297.

67. Mallick, S., 1999, Some practical aspects of prestack waveform inversion using a genetic algorithm: An example from the east Texas Woodbine gas sand, Geophysics, Soc. of Expl. Geophys., 64, 326–336.

68. Monson, G.D. and Pita, J.A.: “Neural Network Prediction of Pseudo-logs for Net Pay and Reservoir Property Interpretation: Greater Zafiro Field Area, Equatorial Guinea,” SEG Annual Meeting (1997).

69. Matsushita, S., A. Kuromiya, M. Yamaoka, T. Furuhashi, and Y. Uchikawa, “A Study on Fuzzy GMDH with Comprehensible Fuzzy Rules,” 1994 IEEE Symposium on Emerging Technologies and Factory Automation; Gaussian control problem, Mathematics of Operations Research, 8 (1), 1983.

70. The MathWorksTM, 1995, Natick.

71. McCormack, M.D., 1990, Neural computing in geophysics, Geophysics: The Leading Edge of Exploration, 10, 1, 11–15.

72. McCormack, M.D., 1991, Seismic trace editing and first break picking using neural networks, 60th Annual SEG Meeting, Expanded Abstracts, 321–324.

73. McCormack, M.D., Stoisits, R.F., Macallister, D.J. and Crawford, K.D., 1999, Applications

of genetic algorithms in exploration and production‘: The Leading Edge, 18, no. 6, 716–718.74. Medsker, L.R., 1994, Hybrid Neural Network and Expert Systems: Kluwer Academic Pub-

lishers, p. 240.75. Monson G.D., and J.A. Pita, 1997, Neural network prediction of pseudo-logs for net pay

and reservoir property interpretation: Greater Zafiro field area, Equatorial Guinea, SEG 1997meeting, Dallas, TX.

76. Nikravesh, M, and F. Aminzadeh, 1997, Knowledge discovery from data bases: Intelligentdata mining, FACT Inc. and LBNL Proposal, submitted to SBIR-NASA.

77. Nikravesh, M. and F. Aminzadeh, 1999, Mining and Fusion of Petroleum Data with FuzzyLogic and Neural Network Agents, To be published in Special Issue, Computer and Geo-science Journal, 1999–2000.

78. Nikravesh, M., 1998a, “Mining and fusion of petroleum data with fuzzy logic and neuralnetwork agents”, CopyRight c© Report, LBNL-DOE, ORNL-DOE, and DeepLook IndustryConsortium.

79. Nikravesh, M., 1998b, “Neural network knowledge-based modeling of rock properties basedon well log databases,” SPE 46206, 1998 SPE Western Regional Meeting, Bakersfield,California, 10–13 May.

Page 334: Forging New Frontiers: Fuzzy Pioneers I

Computational Intelligence for Geosciences and Oil Exploration 331

80. Nikravesh, M., A.E. Farell, and T.G. Stanford, 1996, Model Identification of Nonlinear Time-Variant Processes Via Artificial Neural Network, Computers and Chemical Engineering, v20,No. 11, 1277p.

81. Nikravesh, M., B. Novak, and F. Aminzadeh, 1998, Data mining and fusion with integratedneuro-fuzzy agents: Rock properties and seismic attenuation, JCIS 1998, The Fourth JointConference on Information Sciences, North Carolina, USA, October 23–2.

82. Nikravesh, M., Adams, R.D., Levey, R.A. and Ekart, D.D.: “Soft Computing: Tools for Intel-ligent Reservoir Characterization (IRESC) and Optimum Well Placement (OWP),” ?????.

83. Nikravesh, M. and Aminzadeh, F.: “Opportunities to Apply Intelligent Reservoir Characteri-zations to the Oil Fields in Iran: Tutorial and Lecture Notes for Intelligent Reservoir Charac-terization,” unpublished.

84. Nikravesh, M., Kovscek, A.R., Murer, A.S. and Patzek, T.W.: “Neural Networks for Field-Wise Waterflood Management in Low Permeability, Fractured Oil Reservoirs,” SPE 35721,SPE Western Regional Meeting, Anchorage (1996), 681–694.

85. Nikravesh, M.: “Artificial Intelligence Techniques for Data Analysis: Integration of Model-ing, Fusion and Mining, Uncertainty Analysis, and Knowledge Discovery Techniques,” 1998BISC-SIG-ES Workshop, Lawrence Berkeley National Laboratory, University of California,Berkeley (1998).

86. Nikravesh, M., R.A. Levey, and D.D. Ekart, 1999, Soft Computing: Tools for Reservoir Char-acterization (IRESC), To be presented at 1999 SEG Annual Meeting.

87. Nikravesh, M. and Aminzadeh, F., 2001, Mining and Fusion of Petroleum Data with FuzzyLogic and Neural Network Agents, Journal of Petroleum Science and Engineering, Vol. 29,pp 221–238,

88. Nordlund, U.: “Formalizing Geological Knowledge – With an Example of Modeling Stratig-raphy using Fuzzy Logic,” Journal of Sedimentary Research (1996) 66, 689–698.

89. Pezeshk, S, C.C. Camp, and S. Karprapu, 1996, Geophysical log interpretation using neuralnetwork, Journal of Computing in Civil Engineering, v10, 136p.

90. Rogers, S.J., J.H. Fang, C.L. Karr, and D.A. Stanley, 1992, Determination of Lithology, fromWell Logs Using a Neural Network, AAPG Bulletin, v76, 731p.

91. Potter, D., Wright, J. and Corbett, P., 1999, A genetic petrophysics approach to facies andpermeability prediction in a Pegasus well, 61st Mtg.: Eur. Assn. Geosci. Eng., Session:2053.

92. Rosenblatt, F., 1962 Principal of neurodynamics, Spartan Books.93. Schuelke, J.S., J.A. Quirein, J.F. Sarg, D.A. Altany, and P.E. Hunt, 1997, Reservoir ar-

chitecture and porosity distribution, Pegasus field, West Texas-An integrated sequencestratigraphic-seismic attribute study using neural networks, SEG 1997 meeting, Dallas,TX.

94. Selva, C., Aminzadeh, F., Diaz B., and Porras J. M., 200, Using Geostatistical Techniques ForMapping A Reservoir In Eastern Venezuela Proceedings of 7th INTERNATIONAL CONGRESS OF THE

BRAZILIAN GEOPHYSICAL SOCIETY, OCTOBER 2001, SALVADOR, BRAZIL

95. Sheriff, R. E., and Geldart, L. P., 1982, Exploration Seismology, Vol. 1 and 1983, Vol. 2,Cambridge

96. Sugeno, M. and T. Yasukawa, 1993, A Fuzzy-Logic-Based Approach to Qualitative Model-ing, IEEE Trans. Fuzzy Syst., 1

97. Tamhane, D., Wang, L. and Wong, P.M.: “The Role of Geology in Stochastic ReservoirModelling: The Future Trends,” SPE 54307, SPE Asia Pacific Oil and Gas Conference andExhibition, Jakarta (1999).

98. Taner, M. T., Koehler, F. and Sherrif, R. E., 1979, Complex seismic trace analysis, Geo-physics, 44, 1196–1212Tamhane, D., Wong, P. M.,Aminzadeh, F. 2002, Integrating Linguis-tic Descriptions and Digital Signals in Petroleum Reservoirs, International Journal of FuzzySystems, Vol. 4, No. 1, pp 586–591

99. Tura, A., and Aminzadeh, F., 1999, Dynamic Reservoir Characterization and SeismicallyConstrained Production Optimization, Extended Abstracts of Society of Exploration Geo-physicist, Houston,

100. Veezhinathan, J., Wagner D., and Ehlers, J. 1991, First break picking using neural network,in Expert Systems in Exploration, Aminzadeh, F. and Simaan, M., Eds, SEG, Tulsa, 179–202.

Page 335: Forging New Frontiers: Fuzzy Pioneers I

332 M. Nikravesh

101. Wang, L. and Wong, P.M.: “A Systematic Reservoir Study of the Lower Cretaceous in AnanOilfield, Erlian Basin, North China,” AAPG Annual Meeting (1998).

102. Wang, L., Wong, P.M. and Shibli, S.A.R.: “Integrating Geological Quantification,Neural Networks and Geostatistics for Reservoir Mapping: A Case Study in theA’nan Oilfield, China,” SPE Reservoir Evaluation and Engineering (1999) 2(6),in press.

103. Wang, L., Wong, P.M., Kanevski, M. and Gedeon, T.D.: “Combining Neural Networks withKriging for Stochastic Reservoir Modeling,” In Situ (1999) 23(2), 151–169.

104. Wong, P.M., Henderson, D.H. and Brooks, L.J.: “Permeability Determination using NeuralNetworks in the Ravva Field, Offshore India,” SPE Reservoir Evaluation and Engineering(1998) 1, 99–104.

105. Wong, P.M., Jang, M., Cho, S. and Gedeon, T.D.: “Multiple Permeability Predictions usingan Observational Learning Algorithm,” Computers and Geosciences (2000), in press.

106. Wong, P.M., Tamhane, D. and Wang, L.: “A Neural Network Approach to Knowledge-BasedWell Interpolation: A Case Study of a Fluvial Sandstone Reservoir,” Journal of PetroleumGeology (1997) 20, 363–372.

107. Wong, P.M., F.X. Jiang and, I.J. Taggart, 1995a, A Critical Comparison of Neural Networksand Discrimination Analysis in Lithofacies, Porosity and Permeability Prediction, Journal ofPetroleum Geology, v18, 191p.

108. Wong, P.M., T.D. Gedeon, and I.J. Taggart, 1995b, An Improved Technique in Prediction:A Neural Network Approach, IEEE Transaction on Geoscience and Remote Sensing, v33,971p.

109. Wong, P. M., Aminzadeh, F and Nikravesh, M., 2001, Soft Computing for Reservoir Char-acterization, in Studies in Fuzziness, Physica Verlag, Germany

110. Yao, X.: “Evolving Artificial Neural Networks,” Proceedings of IEEE (1999) 87, 1423–1447.111. Yashikawa, T., Furuhashi, T. and Nikravesh, M.: “Development of an Integrated Neuro-

Fuzzy-Genetic Algorithm Software,” 1998 BISC-SIG-ES Workshop, Lawrence Berkeley Na-tional Laboratory, University of California, Berkeley (1998).

112. Yoshioka, K, N. Shimada, and Y. Ishii, 1996, Application of neural networks and co-krigingfor predicting reservoir porosity-thickness, GeoArabia, v1, No 3.

113. Zadeh L.A. and Yager R.R.” Uncertainty in Knowledge Bases”’ Springer-Verlag, Berlin,1991.11Zadeh L.A., “The Calculus of Fuzzy if-then-Rules”, AI Expert 7 (3), 1992.

114. Zadeh L.A., “The Roles of Fuzzy Logic and Soft Computing in the Concept, Design anddeployment of Intelligent Systems”, BT Technol Journal, 14 (4) 1996.

115. Zadeh, L. A. and Aminzadeh, F.( 1995) Soft computing in integrated exploration, proceed-ings of IUGG/SEG symposium on AI in Geophysics, Denver, July 12.

116. Zadeh, L.A., 1965, Fuzzy sets, Information and Control, v8, 33353.117. Zadeh, L.A., 1973, Outline of a new approach to the analysis of complex systems and deci-

sion processes: IEEE Transactions on Systems, Man, and Cybernetics, v. SMC-3, 244118. Zadeh, L.A.: “Fuzzy Logic, Neural Networks, and Soft Computing,” Communications of the

ACM (1994) 37(3), 77–84.119. Zadeh, L. and Kacprzyk, J. (eds.): Computing With Words in Information/Intelligent Systems

1: Foundations, Physica-Verlag, Germany (1999).120. Zadeh, L. and Kacprzyk, J. (eds.): Computing With Words in Information/Intelligent Systems

2: Applications, Physica-Verlag, Germany (1999).121. Zadeh, L.A.: “Fuzzy Logic = Computing with Words,” IEEE Trans. on Fuzzy Systems (1996)

4, 103–111.122. Bishop, C.: Neural Networks for Pattern Recognition, Oxford University Press, NY (1995).123. von Altrock, C., 1995, Fuzzy Logic and NeuroFuzzy Applications Explained: Prentice Hall

PTR, 350p.

Page 336: Forging New Frontiers: Fuzzy Pioneers I

Hierarchical Fuzzy Classification of Remote Sensing Data

Yan Wang, Mo Jamshidi, Paul Neville, Chandra Bales and Stan Morain

Abstract Land cover mixing and high input dimension are two important issues affecting the classification accuracy of remote sensing images. Fuzzy classification has been developed to represent mixtures of land covers. Two fuzzy classifiers, Fuzzy Rule-Based (FRB) and Fuzzy Neural Network (FNN), were studied to illustrate the interpretability of fuzzy classification. A hierarchical structure was proposed to simplify multi-class classification into multiple binary classifications, reducing the computation time caused by a high number of inputs. The classifiers were compared on the land cover classification of a Landsat 7 ETM+ image over Rio Rancho, New Mexico, and it was shown that the Hierarchical Fuzzy Neural Network (HFNN) classifier offers the best combination of classification accuracy and CPU time.

1 Introduction

Remotely-sensed imagery classification involves the grouping of image data into a finite number of discrete classes. The Maximum Likelihood Classifier (MLC) method, widely used in remote sensing classification, is based on the assumption of normally distributed data. However, geo-spatial phenomena do not occur randomly in nature and frequently are not displayed in the image data with a normal distribution. Neural Network (NN) classifiers, which have no such data distribution requirement [1],

Yan Wang
Intelligent Inference Systems Corp., NASA Research Park, MS: 566-109, Moffett Field, CA, 94035
e-mail: [email protected]

Mo Jamshidi
Department of Electrical and Computer Engineering and ACE Center, University of Texas at San Antonio, San Antonio, TX, 78924
e-mail: [email protected]

Paul Neville · Chandra Bales · Stan Morain
The Earth Data Analysis Center (EDAC), University of New Mexico, Albuquerque, New Mexico, 87131
e-mail: {pneville, cbales, smorain}@edac.unm.edu

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


have increasingly been used in remote sensing classification. In remote sensing data classification, multiple classes are identified, and each class is represented by a variety of patterns to reflect the natural variability. The neural network works on the basis of training data and learning algorithms, which cannot be interpreted in human language. Normally, neural network classification takes longer training and/or classification time than the Maximum Likelihood Classifier [1].

In remote sensing images, a pixel might represent a mixture of class covers, within-class variability, or other complex surface cover patterns that cannot be properly described by one class. These may be caused by the ground characteristics of the classes and the image spatial resolution. Since one class cannot uniquely describe these pixels, fuzzy classification [2, 3] has been developed, in contrast to traditional classification, where a pixel either does or does not belong to a class. In fuzzy classification, each pixel belongs to a class with a certain degree of membership, and the sum of all class degrees is 1. A fuzzy sets approach [2, 4, 5] to image classification makes no assumption about the statistical distribution of the data and so reduces classification inaccuracies. It allows for the mapping of a scene's natural fuzziness or imprecision [2], and provides more complete information for a thorough image analysis. The fuzzy c-means algorithm [6], an unsupervised method, is widely used in fuzzy classification and outputs membership values. Fuzzy k-Nearest Neighbour (kNN) [7] and fuzzy MLC [8] algorithms have been applied to improve classification accuracy. Fuzzy Rule-Based (FRB) classifiers are used in [9, 10] for multispectral images with specific membership functions. Based on human knowledge, fuzzy expert systems are widely used in control systems [11], with general membership functions and interpretative ability, and will be applied here to land cover classification.

There are two common ways to generate rules in a fuzzy expert system: expert knowledge and training data. Given the natural variability and complicated patterns in remote sensing data, it is difficult to incorporate complete fuzzy rules from expert knowledge into the classification system. The training data is preferred for obtaining the rules, but there is no learning process to adapt to the patterns. Fuzzy Neural Network (FNN) classifiers, which combine the learning capability of neural networks with fuzzy classification, have been applied to remote sensing data classification. Fuzzy classification has been applied in neural networks to relate their outputs to the class contribution in a given pixel [12, 13]. The combination of fuzzy c-means and neural networks [14, 15] has attracted attention again. A possibility-based fuzzy neural network is proposed in [16] as an extension of a typical multilayer neural network.

High input dimension is a crucial problem in land cover classification since it can make the classification complexity and the computation time unacceptable. Different feature extraction methods, such as Principal Component Analysis (PCA) [17], Decision Boundary Feature Extraction (DBFE) [18], genetic algorithms [7], etc., have been applied to preprocess the data. We will propose a hierarchical structure to simplify multi-class classification into multiple binary-class classifications so that the computation time is not increased even with a high number of input bands. Specifically, the fuzzy neural network classifier with the hierarchical structure is proposed as the Hierarchical Fuzzy Neural Network (HFNN) classifier. The expert


knowledge can be used not only to generate the fuzzy rules, but also to build up the hierarchical structure.

In this paper, the land cover classification of a Landsat 7 Enhanced Thematic Mapper Plus (ETM+) image over Rio Rancho, New Mexico is studied. Section II gives the Landsat 7 ETM+ data set. Section III describes the Neural Network (NN), Fuzzy Rule-Based (FRB) and Fuzzy Neural Network (FNN) classifiers. Section IV illustrates the hierarchical structure and the Hierarchical Fuzzy Neural Network (HFNN) classifier. Section V presents the fuzzy classification and hierarchical fuzzy classification results on the Landsat 7 ETM+ image. The classification performance is evaluated using overall, average, and individual producer accuracy, and the kappa coefficient (Foody, 1992). Section VI concludes with a summary.
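These evaluation measures can all be computed from a confusion matrix; a minimal sketch follows (the layout with rows indexing the reference classes and columns the predicted classes is an assumption):

```python
import numpy as np

def accuracy_measures(cm):
    """cm[i, j]: number of pixels of reference class i assigned to class j.

    Returns overall accuracy, average (of producer) accuracy,
    per-class producer accuracies, and the kappa coefficient.
    """
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    producer = np.diag(cm) / cm.sum(axis=1)   # per-class producer accuracy
    overall = np.diag(cm).sum() / n
    average = producer.mean()
    # Chance agreement from the row/column marginals, then kappa.
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n ** 2
    kappa = (overall - pe) / (1.0 - pe)
    return overall, average, producer, kappa
```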

2 Experimental Setup

2.1 Landsat 7 ETM+ Data

The Landsat 7 ETM+ instrument is a nadir-viewing, multispectral scanning radiometer, which provides image data for the Earth's surface via eight spectral bands (NASA, 2000; USGS Landsat 7, 2000). These bands cover the visible and near infrared (VNIR), the mid-infrared (Mid-IR), and the thermal infrared (TIR) regions of the electromagnetic spectrum (Table 1). The image is initially obtained as a Level 1G data product through pixel reformatting, radiometric correction, and geometric correction (Bales, 2001). Data are quantized at 8 bits. The image (Fig. 1) used in the study was acquired over Rio Rancho, New Mexico and is 744 lines × 1014 columns (754,416 pixels total) for each band. Nine types of land cover are identified in this area: water (WT), urban impervious (UI), irrigated vegetation (IV), barren (BR), caliche-barren (CB), bosque/riparian forest (BQ), shrubland (SB), natural grassland (NG), and juniper savanna (JS). Urban areas, with their mix of buildings, streets and vegetation, are an example of a high degree of class mixture. Similarly, the shrubland, natural grassland and juniper savanna classes are highly mixed.

In addition, two non-spectral bands are considered. NDVI (Normalized Difference Vegetation Index, TM9) is used to discriminate between the land covers'

Table 1 Landsat 7 ETM+ bands, spectral ranges, and ground resolutions

Band Number Spectral Range (μm) Ground Resolution (m)

TM1 (Vis-Blue)     0.450–0.515    30
TM2 (Vis-Green)    0.525–0.605    30
TM3 (Vis-Red)      0.630–0.690    30
TM4 (NIR)          0.750–0.900    30
TM5 (Mid-IR)       1.550–1.750    30
TM6 (TIR)          10.40–12.50    60
TM7 (Mid-IR)       2.090–2.350    30
TM8 (Pan)          0.520–0.900    15


Fig. 1 ETM+ image with bands TM1, TM4, TM7 displayed in blue, green, and red, respectively; training and testing areas displayed in red and yellow squares, respectively

vegetation responses. A scaled NDVI [19] for display is computed by:

Scaled NDVI = 100 × [(TM4 − TM3)/(TM4 + TM3) + 1]. (1)

In (1), TM4 is the near-infrared band and TM3 is the visible red band, with values greater than 100 indicating an increasing vegetation response, and lower values (as they approach 0) indicating an increasing soil response. DEM (Digital Elevation Model, TM10) can be used to discriminate between some map units such as the juniper savanna, which is found at higher elevations, and the bosque/riparian forest, which is found at lower elevations. These two band images are displayed in Fig. 2.
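Per pixel, Eq. (1) is a one-line array operation on the band images; a minimal sketch (the guard against a zero denominator, which can occur where TM3 = TM4 = 0 in 8-bit data, is an added assumption):

```python
import numpy as np

def scaled_ndvi(tm4, tm3):
    """Scaled NDVI per Eq. (1): 100 * [(TM4 - TM3)/(TM4 + TM3) + 1].

    Values above 100 indicate an increasing vegetation response;
    values approaching 0 indicate an increasing soil response.
    """
    tm4 = tm4.astype(float)
    tm3 = tm3.astype(float)
    denom = tm4 + tm3
    # Where TM4 + TM3 == 0, take the ratio as 0 (scaled value 100).
    ratio = np.divide(tm4 - tm3, denom,
                      out=np.zeros_like(denom), where=denom != 0)
    return 100.0 * (ratio + 1.0)
```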

2.2 Regions of Interest (ROIs) and Signature Data

Regions of Interest (ROIs) are groups of image pixels which represent known class labels, also called ground-truth data. First, sixty-nine sampling data are collected based on information gathered in the field, using a GPS to record the locations and the map unit that each sample is categorized as. Second, the sampling data are extended to sixty-nine homogeneous regions using a region growing method in ERDAS IMAGINE [20]. In the region growing, a distance and a maximum number of pixels are set for the polygon (or linear) region, and from the seeds (field samples) the contiguous pixels within the predefined spectral distance are included in the regions of interest.
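The region-growing step can be read as a flood fill from each field sample; a sketch assuming 4-connected neighbours and Euclidean spectral distance to the seed pixel (ERDAS IMAGINE's exact growing criteria may differ):

```python
import numpy as np
from collections import deque

def grow_region(image, seed, max_dist, max_pixels):
    """image: (rows, cols, bands) array; seed: (row, col) field sample.

    Grow a region of contiguous pixels whose spectral distance to the
    seed pixel is at most max_dist, up to max_pixels pixels.
    """
    rows, cols, _ = image.shape
    seed_vec = image[seed].astype(float)
    region, queue = {seed}, deque([seed])
    while queue and len(region) < max_pixels:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                dist = np.linalg.norm(image[nr, nc].astype(float) - seed_vec)
                if dist <= max_dist:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region
```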



Fig. 2 (a) Scaled Normalized Difference Vegetation Index (TM9) image (b) Digital Elevation Model (TM10) image

From these seed polygons, basic descriptive statistics are gathered from each of the pixels in the seed polygons for each of the bands; these are the signature data. The signature means are plotted in Fig. 3. From the plot, we can see that some classes have very similar statistics, such as natural grassland and shrubland, or barren and caliche-barren, which are hard to separate for the maximum likelihood classifier based on the data statistics. In total, 9,968 ground-truth data points are collected from the sixty-nine regions, of which 37 areas (4,901 data points) are randomly selected as the training data and the other 32 areas (5,068 data points) are used as the testing data. Their distribution for each class is listed in Table 2.

Fig. 3 Signature mean (mean values of bands TM1–TM10 for the nine land cover classes)


Table 2 Training and testing data (# data points / # areas) distribution

Class Training Data (4901/37) Testing Data (5068/32)

Water                  265/3     303/2
Urban Impervious       977/6     1394/6
Irrigated Vegetation   709/4     601/4
Barren                 1729/8    1746/7
Caliche-Barren         133/2     70/1
Bosque                 124/2     74/1
Shrubland              470/5     453/5
Natural Grassland      229/4     263/3
Juniper Savanna        265/3     164/3

3 Classification Methods

3.1 Neural Network (NN) Classifiers

One of the most popular multilayer neural networks is the Back-Propagation Neural Network (BPNN), which is based on gradient descent in error and is also known as the back-propagation algorithm or generalized delta rule. Figure 4 shows a simple three-layer neural network consisting of an input layer, a hidden layer, and an output layer. More hidden layers can be used for special problem conditions or requirements. There are two operations in the learning process: feedforward and back propagation [21]. The number of input units is determined by the number of features n, and the number of output units corresponds to the number of categories c. The target output of the training data (x, ωi) is coded as a unit vector t = (t1, t2, . . . , tc)^T with ti = 1 and tj = 0 (where i, j = 1, 2, . . . , c, and j ≠ i), where ωi is the category the input vector x = (x1, x2, . . . , xn)^T belongs to. The classification output ω of the input x is determined by the maximum element of the output z = (z1, z2, . . . , zc)^T.

Fig. 4 Three-layer feedforward networks
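The network just described can be sketched in a few lines: sigmoid units trained by gradient descent on the squared error against the one-hot target t, with the predicted category taken as the maximum element of z. The bias-free layers, weight initialization, and learning rate below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class BPNN:
    """Three-layer feedforward network trained by back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, lr=1.0):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.lr = lr

    def forward(self, x):
        self.y = sigmoid(x @ self.W1)        # hidden activations
        self.z = sigmoid(self.y @ self.W2)   # output activations
        return self.z

    def train_step(self, x, t):
        z = self.forward(x)
        dz = (z - t) * z * (1.0 - z)                      # output-layer delta
        dy = (dz @ self.W2.T) * self.y * (1.0 - self.y)   # back-propagated delta
        self.W2 -= self.lr * np.outer(self.y, dz)
        self.W1 -= self.lr * np.outer(x, dy)

    def predict(self, x):
        # Classification output: index of the maximum element of z.
        return int(np.argmax(self.forward(x)))
```

With n = 3 input bands and c = 9 classes, `BPNN(3, 10, 9)` would correspond to a 3-input, 10-hidden-node, 9-output configuration.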


3.2 Fuzzy Rule-Based (FRB) Classifiers

A fuzzy expert system is an automatic system that is capable of mimicking human actions for a specific task. There are three main operations in a fuzzy expert system [11]. The first operation is fuzzification, which is the mapping from a crisp point to a fuzzy set. The second operation is inferencing, which is the evaluation of the fuzzy rules in IF-THEN form. The last operation is defuzzification, which maps the fuzzy output of the expert system into a crisp value, as shown in Fig. 5. This figure also represents a definition of a fuzzy control system.

In the FRB classification, the ranges of the input variables are divided into subregions in advance and a fuzzy rule is defined for each grid cell. A set of fuzzy rules is defined for each class. The membership degree of an unknown input for each class is calculated, and the input is classified into the class associated with the highest membership degree. The FUzzy Logic DEvelopment Kit (FULDEK) [22] is used to generate the fuzzy rules from data points. In FULDEK, each input variable is divided evenly and the output variable is represented by a singleton function. More subregions (up to 125) or more rules (up to 400) are required to map the input and output data points with lower error, without a pruning process.
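A toy version of this scheme, with triangular membership functions over two input bands, one rule per class subregion, min for AND, and max aggregation per class (all of these choices are illustrative assumptions rather than FULDEK's exact mechanics):

```python
def tri(x, a, b, c):
    """Triangular membership function rising on [a, b], falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# One rule per (class, subregion):
# IF band1 is A AND band2 is B THEN class; A and B are (a, b, c) triangles.
RULES = [
    ("water",  (0, 20, 60),   (0, 30, 70)),
    ("barren", (40, 90, 140), (50, 100, 160)),
]

def classify(x1, x2):
    degrees = {}
    for label, mf1, mf2 in RULES:
        firing = min(tri(x1, *mf1), tri(x2, *mf2))             # AND via min
        degrees[label] = max(degrees.get(label, 0.0), firing)  # aggregate via max
    # The input goes to the class with the highest membership degree.
    return max(degrees, key=degrees.get)
```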

3.3 Fuzzy Neural Network (FNN) Classifiers

A Fuzzy Neural Network (FNN) is a connectionist model for fuzzy rule implementation and inference, in which fuzzy rule prototypes are embedded in a generalized neural network, and which is trained using numerical data. There is a wide variety of FNN architectures and functionalities. They differ in the type of fuzzy rules, the type of inference method, and the mode of operation. A general FNN architecture consists of five layers. Figure 6 depicts a FNN for two exemplar fuzzy rules of Takagi–Sugeno type, where the consequent of each rule is a linear combination of the inputs:

Fig. 5 Definition of a fuzzy expert system


Fig. 6 Fuzzy neural network (FNN) structure

Rule 1: If x1 is A1 and x2 is B1, then f1 = p11 x1 + p12 x2 + r1. (2)
Rule 2: If x1 is A2 and x2 is B2, then f2 = p21 x1 + p22 x2 + r2.

where Ai, Bi (i = 1, 2) are fuzzy linguistic membership functions and pij, ri are parameters in the consequent fi of Rule i. The network consists of the following layers: an input layer, whose neurons represent the input variables x1 and x2; a fuzzification layer, whose neurons represent the fuzzy membership degrees μAi(x1) and μBi(x2) of x1 and x2; a rule layer, whose neurons represent the fuzzy rules with normalized firing strengths w̄i (the firing strength wi can be the product of membership degrees, μAi(x1) × μBi(x2)); an action layer, whose neurons represent the weighted rule consequents w̄i fi (i = 1, 2); and an output layer, whose neuron represents the output variable

o = ∑(i=1..2) w̄i fi = (∑(i=1..2) wi fi) / (∑(i=1..2) wi).

ANFIS (Adaptive-Network-Based Fuzzy Inference System) [23] is considered as the fuzzy neural network classifier; it is able to build up fuzzy rules from data points, and zeroth- or first-order Sugeno-type fuzzy inference is provided. A gradient descent learning algorithm, alone or in combination with a least squares estimate (hybrid learning), is available to adjust the parameters in (2).
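The inference of the two rules in (2) reduces to a weighted average of the linear consequents; a sketch with Gaussian membership functions, whose centres, widths, and consequent parameters are placeholders standing in for values ANFIS would learn:

```python
import math

def gauss(x, c, s):
    """Gaussian membership function with centre c and width s."""
    return math.exp(-((x - c) ** 2) / (2 * s ** 2))

def tsk_output(x1, x2, rules):
    """First-order Sugeno inference: o = sum(w_i * f_i) / sum(w_i)."""
    ws, fs = [], []
    for (A, B, p1, p2, r) in rules:
        w = gauss(x1, *A) * gauss(x2, *B)   # firing strength (product AND)
        ws.append(w)
        fs.append(p1 * x1 + p2 * x2 + r)    # linear consequent f_i
    return sum(w * f for w, f in zip(ws, fs)) / sum(ws)

# Rule i: (A_i params, B_i params, p_i1, p_i2, r_i) -- placeholder values.
rules = [((0.0, 1.0), (0.0, 1.0), 1.0, 1.0, 0.0),
         ((5.0, 1.0), (5.0, 1.0), 2.0, 2.0, 1.0)]
```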

4 Hierarchical Structure

In Section II, we defined the nine land cover classes to be separated. Normally, it is hard to achieve good performance with only two or three input bands, especially for similar classes without a big differentiation in most spectral bands. To discriminate the similar classes, we have introduced the two non-spectral bands, NDVI and DEM. For the fuzzy classifiers discussed in Section III, assume that an input band is partitioned by m membership functions; then n input bands will require m^n × 9 rules for the classification, so the classification complexity increases exponentially with the number of input bands. With a high number of rules, the time


required for the classification would be unreasonable. A hierarchical structure isproposed to solve the problem.

In the hierarchical structure, any classifier discussed in Section III could be used. The hierarchical structure simplifies multi-class one-level classification into gradual levels of binary-class classification. In the first level, all the land cover classes are separated into two groups. Each group is further divided into sub-groups in the next level through binary classification, until there is only one land cover class remaining in the group. For the group classification, additional input bands are selected based on observing the signature data and consulting geographical experts. For the signature data in Fig. 3, we found that bands TM5 and TM7 are effective for separating the nine land cover classes into two groups. In Fig. 7 (a), the upper group consists


Fig. 7 Classification structure of (a) HFNN (b) FNN (WT: Water, UI: Urban Impervious, IV: Irrigated Vegetation, BR: Barren, CB: Caliche-Barren, BQ: Bosque, SB: Shrubland, NG: Natural Grassland, JS: Juniper Savanna)


of water, urban impervious, irrigated vegetation, and bosque, with lower brightness values in bands TM5 and TM7; the lower group consists of barren, caliche-barren, shrubland, natural grassland, and juniper savanna, with higher brightness values in TM5 and TM7. Similarly, the upper and lower groups are further separated into two groups with the aid of additional input bands, until each group is finalized with one land cover class. For each group classification, two or three input bands are enough to achieve good performance regardless of the total number of input bands. By using the hierarchical structure, the classification complexity is not affected by the increasing number of input bands. The classification structure with the Hierarchical Fuzzy Neural Network (HFNN) is compared to the one with the Fuzzy Neural Network (FNN) in Fig. 7. The hierarchical structure is not unique, and it is flexible. If there is any change in the input bands, we only need to modify the levels with the changed bands and their posterior levels, without changing the previous levels.

In total, five spectral bands (TM1, TM3, TM5, TM7, TM8) and two non-spectral bands (TM9, TM10) are applied to the classification. If normal-type rules were used blindly, the number of rules needed for the classification would be extremely high. Assuming each input is partitioned using two membership functions, the number of rules needed would be 2^7 × 9 = 1152. However, the hierarchical structure with the same membership functions consists of eight group classifications, six with two inputs and two with three inputs, so altogether 2^2 × 6 + 2^3 × 2 = 40 rules are used. The number of rules is reduced tremendously by the hierarchical classification, which saves a great deal of computation time.
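The rule-count comparison (m = 2 membership functions per input, 7 bands, 9 classes) can be checked in a couple of lines:

```python
m = 2                      # membership functions per input band
n_bands, n_classes = 7, 9
flat_rules = m ** n_bands * n_classes     # one-level classifier: 2^7 * 9
# Hierarchy: eight binary group classifiers, six with 2 inputs, two with 3.
hier_rules = m ** 2 * 6 + m ** 3 * 2      # 2^2 * 6 + 2^3 * 2
print(flat_rules, hier_rules)             # → 1152 40
```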

5 Classification Results

5.1 Fuzzy Classification

In this experiment, we illustrate the fuzzy classification results of the Fuzzy Rule-Based (FRB) and Fuzzy Neural Network (FNN) classifiers on the ETM+ image, compared with those of the Maximum Likelihood Classifier (MLC) and Back-Propagation Neural Network (BPNN) classification. First, the three input bands TM1, TM4 and TM7 are recommended for the classification by geographical experts, as one of the most effective combinations to discriminate the land cover classes.

In the MLC classification, the mean and covariance matrix of each category are estimated from the training data and each pixel is assigned to the category with the maximum posterior probability. The accuracy of the water and bosque classes is zero, and the overall and average accuracy is only 64% and 47%, respectively (Table 3), which is not acceptable for most land cover classification tasks.
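The MLC step can be sketched as follows. This is our own minimal implementation, assuming equal class priors, so that maximizing the Gaussian log-likelihood also maximizes the posterior probability:

```python
import numpy as np

def train_mlc(X, y):
    # Estimate a mean vector and covariance matrix per category
    # from the training pixels (rows of X, labels in y).
    return {c: (X[y == c].mean(axis=0), np.cov(X[y == c], rowvar=False))
            for c in np.unique(y)}

def classify_mlc(stats, x):
    # Assign the pixel x to the category with the maximum log-likelihood
    # (constant terms dropped; equal priors assumed).
    def log_like(mu, cov):
        d = x - mu
        return -0.5 * (np.log(np.linalg.det(cov)) + d @ np.linalg.solve(cov, d))
    return max(stats, key=lambda c: log_like(*stats[c]))
```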

In the BPNN classification, one hidden layer with 5, 10, and 15 hidden nodes and two hidden layers with 5, 5 hidden nodes are tested, with 1000 and 10000 training epochs respectively, on the ETM+ image. Each band is first normalized


Hierarchical Fuzzy Classification of Remote Sensing Data 343

Table 3 Classification accuracy (%) comparison (BPNN 3109, 10000 represents the BPNN with 3 input nodes, 10 hidden nodes, 9 output nodes, trained with 10000 epochs; the bolded columns are for comparison)

Classes               MLC    BPNN   BPNN   BPNN   BPNN   BPNN   BPNN   BPNN   BPNN   FRB    FNN
                             359,   359,   3109,  3109,  3159,  3159,  3559,  3559,
                             1000   10000  1000   10000  1000   10000  1000   10000
Water                 0      100    100    100    100    100    100    0      100    100    100
Urban Impervious      99.78  99.28  99.43  99.21  98.35  99.43  98.13  99.07  99.07  96.7   99
Irrigated Vegetation  73.54  97.5   95.01  97.34  97.5   97.84  98.17  98.5   98     97.34  97.5
Barren                56.41  93.36  91.81  89.98  93.07  90.49  92.9   89.69  92.1   93.36  90.66
Caliche-Barren        87.14  0      0      0      0      0      0      0      0      0      0
Bosque                0      71.62  100    89.19  95.95  89.19  89.19  0      100    100    100
Shrubland             36.2   67.55  63.58  62.91  74.61  56.29  71.96  99.34  64.68  66.89  64.46
Natural Grassland     53.61  0      28.9   50.95  33.08  59.32  0      0      59.32  4.94   47.15
Juniper Savanna       20.73  0      8.54   14.02  14.63  7.32   14.02  0      0      5.49   22.56
Overall               63.5   84.1   85.14  85.83  86.92  85.75  84.81  78.71  86.9   84.16  86.4
Average               47.49  58.81  65.25  67.07  67.47  66.65  62.71  42.96  68.13  62.75  69.04
Kappa Coefficient     54.21  79.19  80.74  81.74  83     81.62  80.19  72.32  83.06  79.48  82.47


344 Y. Wang et al.

to be in [0, 1]. The output nodes are coded with each node representing a class. The training of the BPNN adaptively adjusts the learning rate and the momentum. The classification accuracy is listed in Table 3. The eight BPNNs with different structures or training parameters result in different accuracies for individual classes, such as the water, bosque, natural grassland, and juniper savanna classes. However, this variation does not affect the zero accuracy of the caliche-barren class, most of which is misclassified as the barren class. The accuracy of the natural grassland and juniper savanna classes is 0–59% and 0–15% respectively, even lower than that of the MLC classification (54% and 21% respectively), due to the mix between these two classes and their mix with the shrubland class. Nevertheless, the overall and average accuracy of the BPNN classification, 79%–87% and 43%–68% respectively, is generally higher than that of the MLC classification, and comes with a higher kappa coefficient, 72%–83%. The BPNN (3109, 10000) classification, with 3 input nodes, 10 hidden nodes, 9 output nodes, and trained with 10000 epochs, is the representative used to compare against the other classifiers.
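The preprocessing just described, per-band min-max scaling into [0, 1] and one-node-per-class output coding, can be sketched as follows (the helper names are ours):

```python
def normalize_band(values):
    # Min-max scale one band's digital numbers into [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(class_index, n_classes=9):
    # Output coding: each BPNN output node represents one land cover class.
    return [1.0 if i == class_index else 0.0 for i in range(n_classes)]
```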

In the FRB classification [24, 25], bands TM1, TM4, and TM7 are normalized to be in [0, 1] as done in the BPNN classification, and then evenly represented by three triangular membership functions, Small, Medium and Big, shown in Fig. 8 (a). There are 27 (3^3) rules generated for each class by combining the different fuzzified input variables, so altogether 27 × 9 = 243 rules are used in the classification. The rule base, for instance for the water class, is the following:

IF TM1 is Small and TM4 is Small and TM7 is Small THEN water is S1;
IF TM1 is Small and TM4 is Small and TM7 is Medium THEN water is S2;
...
IF TM1 is Big and TM4 is Big and TM7 is Medium THEN water is S26;
IF TM1 is Big and TM4 is Big and TM7 is Big THEN water is S27.

where S1, . . ., S27 are singular membership functions for the water class (Table 4); multiplication is used for evaluating the AND of the antecedents; all the rules are combined by a maximum operation. The maximum output of all the rules represents the membership degree of a pixel belonging to the water class.
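A minimal sketch of this rule evaluation, under our own reading of the scheme: each input is fuzzified by the three triangles, a rule's firing strength is the product of its three antecedent degrees, each strength is weighted by the rule's singular output, and the rules are combined by a maximum. All names below are ours:

```python
import itertools

def tri(x, a, b, c):
    # Triangular membership function with support (a, c) and peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzify(x):
    # Three evenly spaced triangles on [0, 1]: Small, Medium, Big
    # (shoulders at the ends so that 0 is fully Small and 1 fully Big).
    small = 1.0 if x <= 0.0 else tri(x, -0.5, 0.0, 0.5)
    medium = tri(x, 0.0, 0.5, 1.0)
    big = 1.0 if x >= 1.0 else tri(x, 0.5, 1.0, 1.5)
    return [small, medium, big]

def class_membership(tm1, tm4, tm7, outputs):
    # Evaluate all 27 rules; outputs holds the 27 rule consequents S1..S27.
    degrees = [fuzzify(tm1), fuzzify(tm4), fuzzify(tm7)]
    strengths = (degrees[0][i] * degrees[1][j] * degrees[2][k]
                 for i, j, k in itertools.product(range(3), repeat=3))
    return max(s * o for s, o in zip(strengths, outputs))
```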

The overall and average accuracy of the FRB classification in Table 3 is degraded by 3% and 4% respectively, and the kappa coefficient is degraded by 3%, compared to that of the BPNN classification. The accuracy of the natural grassland and juniper savanna classes is much lower, 5% and 6% respectively.

To improve the individual class accuracy of the FRB classifier, we have applied the FNN classification. Similarly, three triangular membership functions are used to represent each input variable, so 27 rules are generated for each class and a total of 243 rules are used in the classification. However, the input membership functions have been changed to adapt to the training data with the hybrid learning algorithm, as displayed in Fig. 8 (b), (c), (d). The rule base for the water class is modified to be:

IF TM1 is TM1Small and TM4 is TM4Small and TM7 is TM7Small THEN water is S1;


[Four panels, each plotting degree of membership (0–1) against the normalized band value (0–1): (a) the three evenly spaced triangles Small, Medium, Big; (b)–(d) the adapted functions TM1Small/TM1Medium/TM1Big, TM4Small/TM4Medium/TM4Big, and TM7Small/TM7Medium/TM7Big.]

Fig. 8 Input membership functions (a) TM1, TM4, TM7 in the FRB classification (b) TM1 in the FNN classification (c) TM4 in the FNN classification (d) TM7 in the FNN classification

IF TM1 is TM1Small and TM4 is TM4Small and TM7 is TM7Medium THEN water is S2;
...
IF TM1 is TM1Big and TM4 is TM4Big and TM7 is TM7Medium THEN water is S26;

where S1, . . ., S27 are the constants for the water class (Table 5) and the other options are the same as those in the FRB classification. If the maximum output of all the rules is less than zero or greater than one, it is clipped to zero or one, respectively.

The natural grassland and juniper savanna classes are discriminated with a better accuracy of 47% and 23%, respectively (Table 3). The overall accuracy is 86%,

Table 4 Singular membership functions for the water class of the FRB classification

S1: 0.749   S5: 0.310   S9: 0.5     S13: 0.294   S17: 0.297   S21: 0.498   S25: 0.261
S2: 0.141   S6: 0.497   S10: 0.04   S14: 0.217   S18: 0.223   S22: 0.224   S26: 0.227
S3: 0.499   S7: 0.238   S11: 0.259  S15: 0.215   S19: 0.217   S23: 0.237   S27: 0.235
S4: 0       S8: 0.285   S12: 0.508  S16: 0.297   S20: 0.245   S24: 0.224


Table 5 Constant outputs for the water class of the FNN classification

S1: 0.999    S5: −0.001   S9: 0.002    S13: −0.001   S17: 0        S21: 0   S25: 0
S2: −0.222   S6: 0.006    S10: −0.349  S14: 0        S18: 0        S22: 0   S26: 0
S3: 3.800    S7: 0.001    S11: 0.093   S15: 0        S19: −0.022   S23: 0   S27: 0
S4: 0.015    S8: −0.010   S12: −1.588  S16: −0.003   S20: 0.043    S24: 0

similar to that of the BPNN classification. The average accuracy is 69%, better than that of the BPNN and FRB classifications, 67% and 62% respectively.

The classification maps of the four classifiers are shown in Fig. 9, with each class displayed in an individual color. In the MLC classification map, all the water and bosque areas are misclassified as urban impervious. The maps of the BPNN and FNN classifiers are similar, and differ from that of the FRB classifier, in which the natural grassland and juniper savanna show up less and are largely covered by the shrubland class, consistent with the previous analysis.

[Four map panels (a)–(d); legend: Water, Urban Impervious, Irrigated Vegetation, Barren, Caliche-Barren, Bosque, Shrubland, Natural Grassland, Juniper Savanna]

Fig. 9 Classification map of (a) MLC (b) BPNN (c) FRB (d) FNN


5.2 Hierarchical Classification

In this experiment we will implement the hierarchical classification in Fig. 7 (a). In the Hierarchical Fuzzy Neural Network (HFNN) classification, each group classification is composed of the FNN classifier, and the FNN outputs of the previous level are fed to the group classification in the next level. The prior classification affects its posterior classification and the final classification results. For each FNN classifier, the input variable is represented by two Gaussian combination membership functions. Similarly, the FNN classification without the hierarchical structure in Fig. 7 (b) is implemented with the seven input bands and with the same membership functions. Their error matrices are listed in Tables 6 and 7, respectively. The overall accuracy, average accuracy, and kappa coefficient of the HFNN classification

Table 6 Error matrix of the HFNN classification (WT: Water, UI: Urban Impervious, IV: Irrigated Vegetation, BR: Barren, CB: Caliche-Barren, BQ: Bosque, SB: Shrubland, NG: Natural Grassland, JS: Juniper Savanna)

Actual    Predicted Classes                                        Accuracy (%)
Classes   WT    UI     IV    BR     CB   BQ   SB    NG   JS

WT        303   0      0     0      0    0    0     0    0        100
UI        14    1367   0     12     0    0    0     0    1        98.06
IV        8     16     577   0      0    0    0     0    0        96.01
BR        0     46     0     1491   53   0    8     37   111      85.4
CB        0     0      0     0      70   0    0     0    0        100
BQ        0     0      0     0      0    74   0     0    0        100
SB        0     4      0     0      0    0    439   0    10       96.91
NG        0     20     0     0      0    0    168   42   33       15.97
JS        0     0      0     2      0    0    0     0    162      98.78

Overall Accuracy (%) = 89.29, Average Accuracy (%) = 87.9, Kappa Coefficient (%) = 86.39, CPU Time = 223 s

Table 7 Error matrix of the FNN classification (WT: Water, UI: Urban Impervious, IV: Irrigated Vegetation, BR: Barren, CB: Caliche-Barren, BQ: Bosque, SB: Shrubland, NG: Natural Grassland, JS: Juniper Savanna)

Actual    Predicted Classes                                        Accuracy (%)
Classes   WT    UI     IV    BR     CB   BQ   SB    NG   JS

WT        303   0      0     0      0    0    0     0    0        100
UI        1     1245   1     36     0    0    1     1    109      89.31
IV        2     79     496   0      0    0    24    0    0        82.53
BR        0     73     2     1527   0    0    2     27   115      87.46
CB        0     0      0     2      68   0    0     0    0        97.14
BQ        0     0      0     0      0    74   0     0    0        100
SB        0     17     0     1      0    0    108   327  0        23.84
NG        0     0      0     0      0    0    103   127  33       48.29
JS        0     0      0     44     2    0    57    0    61       37.2

Overall Accuracy (%) = 79.1, Average Accuracy (%) = 73.97, Kappa Coefficient (%) = 73.41, CPU Time = 10070 s


[Two map panels (a) and (b); legend: Water, Urban Impervious, Irrigated Vegetation, Barren, Caliche-Barren, Bosque, Shrubland, Natural Grassland, Juniper Savanna]

Fig. 10 Classification map of (a) HFNN (b) FNN

is 10%, 14%, and 12% higher, respectively, than that of the FNN classification. The HFNN classification runs in 223 s (3.7 min) on a Pentium IV 2.2 GHz computer. The FNN classification, however, costs 10070 s (2.8 h) on the same computer, around 45 times the HFNN computation time. Even when the input membership functions are changed in number or type, so that the accuracy difference between the HFNN and FNN classifications varies, the HFNN classifier always gains better classification accuracy in shorter CPU time than the FNN classifier with the same parameters. The classification maps of the HFNN and FNN classifiers are compared in Fig. 10.
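The summary figures under the error matrices can be reproduced from the matrices themselves. A minimal sketch (our own helper, using the standard definitions of overall accuracy, average accuracy, and the kappa coefficient):

```python
def accuracy_metrics(matrix):
    # matrix[i][j]: number of pixels of actual class i predicted as class j.
    k = len(matrix)
    n = sum(map(sum, matrix))
    rows = [sum(r) for r in matrix]                                # actual totals
    cols = [sum(matrix[i][j] for i in range(k)) for j in range(k)] # predicted totals
    overall = sum(matrix[i][i] for i in range(k)) / n
    average = sum(matrix[i][i] / rows[i] for i in range(k)) / k
    # Kappa corrects overall accuracy for chance agreement p_e.
    p_e = sum(r * c for r, c in zip(rows, cols)) / n ** 2
    kappa = (overall - p_e) / (1 - p_e)
    return overall, average, kappa
```

Feeding it the HFNN error matrix of Table 6 reproduces the reported 89.29% overall accuracy, 87.9% average accuracy, and 86.39% kappa coefficient.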

Compared to the FNN classifier with the three input bands, the HFNN classifier improves the overall accuracy and average accuracy by 3% and 9% respectively, with a 4% higher kappa coefficient. In particular, the caliche-barren class is separated from the barren class with an accuracy of 100%, and the classification of the shrubland and juniper savanna classes is much improved, with an accuracy of 97% and 99% respectively. The natural grassland class is still not classified well, with an accuracy of 16%; 64% of its area is misclassified as shrubland. The FNN classifier with the seven input bands, however, degrades the overall accuracy and kappa coefficient by 7% and 9% respectively, but improves the average accuracy by 5%, since each input variable is represented by fewer membership functions.

6 Conclusions

In this paper, we discussed the Neural Network (NN), Fuzzy Rule-Based (FRB), and Fuzzy Neural Network (FNN) classifiers, which were compared and tested on the Landsat 7 ETM+ image over Rio Rancho, NM. The FRB and FNN classifiers were represented by IF-THEN rules, which are interpretable by humans and make it easy to incorporate human experience. But the FRB classifier did not achieve classification accuracy as good as that of the NN classifier because of the limited expert knowledge


or rule base. The FNN classifier improved the classification accuracy of some categories by combining the learning capability of neural networks. Even though we applied different classification methods to improve the classification accuracy, some similar classes could still not be separated, or could only be separated with very low accuracy.

To discriminate the similar classes, we introduced some non-spectral bands, which on the other hand would increase the classification complexity and the computation time. A high input dimension could make the computation time too long to be acceptable. The hierarchical structure was proposed to simplify the multi-class classification into multiple binary classifications, so that the classification complexity is not increased even with a larger number of input bands. By using the hierarchical structure, we can gain better classification accuracy in shorter computation time. It was shown in the experiment that the Hierarchical Fuzzy Neural Network (HFNN) classifier achieved the best overall accuracy (89%) and average accuracy (88%) in a short CPU running time (3.7 min) with only 40 fuzzy rules. In particular, the similar classes, such as the caliche-barren, shrubland and juniper savanna classes, were separated with higher accuracy.

In the fuzzy classification, we only used linear rules and linear outputs. Nonlinear rules or nonlinear outputs, such as the extension of ANFIS, Coactive Neuro-Fuzzy Modeling (CANFIS) [26], could be considered to improve the classification accuracy, especially that of the natural grassland class, which was still low. An appropriate hierarchical structure is necessary to gain good classification performance, but it is difficult to generate an optimal one by observing the data statistics. In future work, we could consider generating the hierarchical structure automatically with genetic algorithms.

Acknowledgments The authors would like to thank the anonymous reviewers for their comments, the staff at the Earth Data Analysis Center (EDAC) at UNM for providing the Landsat 7 data set and expert knowledge, and the professors and students at the Autonomous Control Engineering (ACE) Center at UNM for their support.

References

1. Heermann, P. D., and N. Khazenie, 1992. "Classification of Multispectral Remote Sensing Data using a Back-Propagation Neural Network", IEEE Trans. Geosci. and Remote Sens., 30(1): 81–88.

2. Foody, G. M., 1992. "A fuzzy sets approach to the representation of vegetation continua from remotely sensed data: An example from lowland heath", Photogramm. Eng. & Remote Sens., 58(2): 221–225.

3. Foody, G. M., 1999. "The Continuum of Classification Fuzziness in Thematic Mapping," Photogramm. Eng. & Remote Sens., 65(4): 443–451.

4. Jeansoulin, R., Y. Fontaine, and W. Frei, 1981. "Multitemporal Segmentation by Means of Fuzzy Sets," Proc. Machine Processing of Remotely Sensed Data Symp., 1981, pp. 336–339.

5. Kent, J. T., and K. V. Mardia, 1988. "Spatial Classification Using Fuzzy Membership Models," IEEE Trans. Pattern Anal. Machine Intel., 10(5): 659–671.


6. Cannon, R. L., J. V. Dave, J. C. Bezdek, and M. M. Trivedi, 1986. "Segmentation of a Thematic Mapper Image Using the Fuzzy C-Means Clustering Algorithm," IEEE Trans. Geosci. and Remote Sensing, GRS-24(3): 400–408.

7. Yu, S., S. De Backer, and P. Scheunders, 2000. "Genetic feature selection combined with fuzzy kNN for hyperspectral satellite imagery", Proc. IEEE International Geosci. and Remote Sensing Symposium (IGARSS'00), July 2000, vol. 2, pp. 702–704.

8. Wang, F., 1990. "Improving Remote Sensing Image Analysis Through Fuzzy Information Representation," Photogramm. Eng. & Remote Sens., 56(8): 1163–1169.

9. Bárdossy, A., and L. Samaniego, 2002. "Fuzzy Rule-Based Classification of Remotely Sensed Imagery", IEEE Trans. Geosci. and Remote Sens., 40(2): 362–374.

10. Melgani, F. A., B. A. Hashemy, and S. M. Taha, 2000. "An explicit fuzzy supervised classification method for multispectral remote sensing images", IEEE Trans. Geosci. and Remote Sens., 38(1): 287–295.

11. Jamshidi, M., 1997. Large-Scale Systems: Modeling, Control, and Fuzzy Logic, Prentice Hall Inc.

12. Foody, G. M., 1996. "Relating the land-cover composition of mixed pixels to artificial neural network classification output", Photogramm. Eng. & Remote Sens., 62: 491–499.

13. Warner, T. A., and M. Shank, 1997. "An evaluation of the potential for fuzzy classification of multispectral data using artificial neural networks", Photogramm. Eng. & Remote Sens., 63: 1285–1294.

14. Tzeng, Y. C. and K. S. Chen, 1998. “A Fuzzy Neural Network to SAR Image Classification”,IEEE Trans. Geosci. and Remote Sens., 36(1).

15. Sun, C. Y., C. M. U. Neale, and H. D. Cheng, 1998. "A Neuro-Fuzzy Approach for Monitoring Global Snow and Ice Extent with the SSM/I", Proc. IEEE International Geosci. and Remote Sensing Symposium (IGARSS'98), July 1998, vol. 3, pp. 1274–1276.

16. Chen, L., D. H. Cooley, and J. P. Zhang, 1999. "Possibility-Based Fuzzy Neural Networks and Their Application to Image Processing", IEEE Trans. Systems, Man, and Cybernetics-Part B: Cybernetics, 29(1): 119–126.

17. Bachmann, C., T. Donato, G. M. Lamela, W. J. Rhea, M. H. Bettenhausen, R. A. Fusina, K. R. Du Bois, J. H. Porter, and B. R. Truitt, 2002. "Automatic classification of land cover on Smith Island, VA, using HyMAP imagery," IEEE Trans. Geosci. and Remote Sens., 40: 2313–2330.

18. Dell'Acqua, F., and P. Gamba, 2003. "Using image magnification techniques to improve classification of hyperspectral data", Proc. IEEE International Geosci. and Remote Sensing Symposium (IGARSS'03), July 2003, vol. 2, pp. 737–739.

19. Bales, C. L. K., 2001. Modeling Feature Extraction for Erosion Sensitivity Using Digital Image Processing, Master Thesis, Department of Geography, University of New Mexico, Albuquerque, New Mexico.

20. ERDAS, 1997. ERDAS Field Guide, Fourth Edition, ERDAS, Inc., Atlanta, Georgia.

21. Duda, R. O., P. E. Hart, and D. G. Stork, 2001. Pattern Classification, John Wiley & Sons (Asia) Pte. Ltd., New York.

22. Dreier, M. E., 1991. FULDEK - Fuzzy Logic Development Kit, TSI Press, Albuquerque, NM.

23. Jang, J. S. R., 1993. "ANFIS: Adaptive-Network-Based Fuzzy Inference System", IEEE Trans. Systems, Man, and Cybernetics, 23(3): 665–685.

24. Wang, Y., and M. Jamshidi, 2004. "Fuzzy Logic Applied in Remote Sensing Image Classification", IEEE Conference on Systems, Man, and Cybernetics, 10–13 Oct. 2004, Hague, Netherlands, pp. 6378–6382.

25. Wang, Y., and M. Jamshidi, 2004. "Multispectral Landsat Image Classification Using Fuzzy Expert Systems", World Automation Congress, June 28 – July 1, 2004, Seville, Spain, in Image Processing, Biomedicine, Multimedia, Financial Engineering and Manufacturing, vol. 18, pp. 21–26, TSI Press, Albuquerque, NM, USA.

26. Jang, J.-S. R., C.-T. Sun, and E. Mizutani, 1997. Neuro-Fuzzy and Soft Computing, Prentice Hall, New Jersey.


Real World Applications of a Fuzzy Decision Model Based on Relationships between Goals (DMRG)

Rudolf Felix

Abstract Real world applications of a decision model of relationships between goals based on fuzzy relations (DMRG) are presented. In contrast to other approaches, the relationships between decision goals or criteria are represented and calculated explicitly for each decision situation. The application fields are decision making for financial services, optimization of production sequences in car manufacturing, and vision systems for quality inspection and object recognition.

1 Introduction

When human decision makers deal with complex decision situations, they intuitively deal with relationships between decision goals or decision criteria, and they reason about why the relationships between the goals exist (Felix 1991). As discussed before, for instance in (Felix 1995), other decision making models are not flexible enough and do not reflect the tension between interacting goals in the way human decision makers do. In contrast, the decision making model underlying the real world applications presented in this paper meets the required flexibility (Felix et al. 1996), (Felix 2001), (Felix 2003). The key issue of this model, called DMRG ("decision making based on relationships between goals"), is the formal definition of both positive and negative relationships between decision goals. Using the model in applications, the relationships are calculated driven by the current decision data; in this sense they are made explicit in every decision situation. After each calculation, the obtained information about the relationships between the goals, together with the priorities of the goals, is used to select the appropriate aggregation strategies. Finally, based on the selected decision strategies, adequate decisions are found.

Rudolf Felix
FLS Fuzzy Logik Systeme GmbH
Joseph-von-Fraunhofer Straße 20, 44227 Dortmund, Germany
e-mail: [email protected]

M. Nikravesh et al., (eds.), Forging New Frontiers: Fuzzy Pioneers I. 351
© Springer 2007


2 Basic Definitions

Before we define relationships between goals, we introduce the notions of the positive impact set and the negative impact set of a goal. A more detailed discussion can be found in (Felix 1991, Felix et al. 1994).

Def. 1) Let A be a non-empty and finite set of potential alternatives, G a non-empty and finite set of goals, A ∩ G = Ø, a ∈ A, g ∈ G, δ ∈ (0, 1]. For each goal g we define the two fuzzy sets Sg and Dg, each from A into [0, 1], by:

1. positive impact function of the goal g

Sg(a) := δ, if a affects g positively with degree δ; 0, else.

2. negative impact function of the goal g

Dg(a) := δ, if a affects g negatively with degree δ; 0, else.

Def. 2) Let Sg and Dg be defined as in Def. 1). Sg is called the positive impact set of g and Dg the negative impact set of g.

The set Sg contains the alternatives with a positive impact on the goal g, and δ is the degree of the positive impact. The set Dg contains the alternatives with a negative impact on the goal g, and δ is the degree of the negative impact.

Def. 3) Let A be a finite non-empty set of alternatives. Let P(A) be the set of all fuzzy subsets of A. Let X, Y ∈ P(A), and let x and y be the membership functions of X and Y respectively. The fuzzy inclusion I is defined as follows:

I : P(A) × P(A) → [0, 1]

I(X, Y) := Σ_{a∈A} min(x(a), y(a)) / Σ_{a∈A} x(a), for X ≠ Ø,
I(X, Y) := 1, for X = Ø,

where x(a) and y(a) are the membership degrees of a in X and Y, respectively.

The fuzzy non-inclusion N is defined as:

N : P(A) × P(A) → [0, 1]

N(X, Y ) := 1− I(X, Y )
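A direct transcription of I and N, with fuzzy sets represented as dictionaries mapping alternatives to membership degrees (the dictionary representation is our own choice):

```python
def inclusion(x, y):
    # Fuzzy inclusion I(X, Y): degree to which X is included in Y.
    denom = sum(x.values())
    if denom == 0:  # X = Ø: inclusion is 1 by definition
        return 1.0
    return sum(min(d, y.get(a, 0.0)) for a, d in x.items()) / denom

def non_inclusion(x, y):
    # Fuzzy non-inclusion N(X, Y) := 1 - I(X, Y).
    return 1.0 - inclusion(x, y)
```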

The inclusions indicate the existence of relationships between two goals. The higher the degree of inclusion between the positive impact sets of two goals, the more cooperative the interaction between them. The higher the degree of inclusion between the positive impact set of one goal and the negative impact set of the second, the more competitive the interaction. The non-inclusions are evaluated in a similar way. The higher the degree of non-inclusion between the positive impact sets of two goals, the less cooperative the interaction between them. The higher the degree of non-inclusion between the positive impact set of one goal and the negative impact set of the second, the less competitive the relationship.

Note that the pair (Sg, Dg) represents the entire known impact of the alternatives on the goal g. (Dubois and Prade, 1992) shows that the so-called twofold fuzzy sets can be taken for (Sg, Dg). Then Sg is the set of alternatives which more or less certainly satisfy the goal g, and Dg is the fuzzy set of alternatives which are rather less possible or tolerable according to the decision maker.

3 Definition of Relationships Between Goals

Based on the inclusion and non-inclusion defined above, 8 basic types of relationships between goals are defined. The relationships cover the whole spectrum from a very high confluence (analogy) between goals to a strict competition (trade-off). Independence of goals and the case of unspecified dependence are also considered.

Def. 4) Let Sg1, Dg1, Sg2 and Dg2 be fuzzy sets given by the corresponding membership functions as defined in Def. 2). For simplicity we write S1 instead of Sg1 etc. Let g1, g2 ∈ G where G is a set of goals.

The relationships between two goals are defined as fuzzy subsets of G × G as follows:

1. g1 is independent of g2 :⇔
   IS-INDEPENDENT-OF(g1, g2) := min(N(S1, S2), N(S1, D2), N(S2, D1), N(D1, D2))

2. g1 assists g2 :⇔
   ASSISTS(g1, g2) := min(I(S1, S2), N(S1, D2))

3. g1 cooperates with g2 :⇔
   COOPERATES-WITH(g1, g2) := min(I(S1, S2), N(S1, D2), N(S2, D1))

4. g1 is analogous to g2 :⇔
   IS-ANALOGOUS-TO(g1, g2) := min(I(S1, S2), N(S1, D2), N(S2, D1), I(D1, D2))

5. g1 hinders g2 :⇔
   HINDERS(g1, g2) := min(N(S1, S2), I(S1, D2))

6. g1 competes with g2 :⇔
   COMPETES-WITH(g1, g2) := min(N(S1, S2), I(S1, D2), I(S2, D1))

7. g1 is in trade-off to g2 :⇔
   IS-IN-TRADE-OFF(g1, g2) := min(N(S1, S2), I(S1, D2), I(S2, D1), N(D1, D2))


[Figure labels: ANALOGOUS, COOPERATION, ASSISTS; TRADE OFF, COMPETE, HINDERS]

Fig. 1 Subsumption of relationships between goals

8. g1 is unspecified dependent from g2 :⇔
   IS-UNSPECIFIED-DEPENDENT-FROM(g1, g2) := min(I(S1, S2), I(S1, D2), I(S2, D1), I(D1, D2))

The relationships 2, 3 and 4 are called positive relationships. The relationships 5, 6 and 7 are called negative relationships.
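Two of the eight relationships, transcribed directly from Def. 4 (the inclusion helpers are repeated here so the sketch is self-contained; impact sets are dictionaries mapping alternatives to degrees, a representation of our own choosing):

```python
def inclusion(x, y):
    # Fuzzy inclusion I(X, Y) of Def. 3.
    denom = sum(x.values())
    if denom == 0:
        return 1.0
    return sum(min(d, y.get(a, 0.0)) for a, d in x.items()) / denom

def N(x, y):
    # Fuzzy non-inclusion N(X, Y) := 1 - I(X, Y).
    return 1.0 - inclusion(x, y)

def cooperates_with(s1, d1, s2, d2):
    return min(inclusion(s1, s2), N(s1, d2), N(s2, d1))

def is_in_trade_off(s1, d1, s2, d2):
    return min(N(s1, s2), inclusion(s1, d2), inclusion(s2, d1), N(d1, d2))

# Goal 1 favors alternative a and dislikes b; goal 2 the other way around:
s1, d1 = {"a": 1.0}, {"b": 1.0}
s2, d2 = {"b": 1.0}, {"a": 1.0}
print(is_in_trade_off(s1, d1, s2, d2))   # 1.0: a strict trade-off
print(cooperates_with(s1, d1, s2, d2))   # 0.0: no cooperation at all
```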

Note: The relationships between goals have a subsumption relation in the sense of Fig. 1.

Furthermore, there are the following duality relations: assists ↔ hinders, cooperates ↔ competes, analogous ↔ trade-off, which correspond to the common sense understanding of relationships between goals.

The relationships between goals turned out to be crucial for an adequate modeling of the decision making process (Felix 1994) because they reflect the way the goals depend on each other and describe the pros and cons of the decision situation. Together with information about goal priorities, the relationships between goals are used as the basic aggregation guidelines in each decision situation: cooperative goals imply a conjunctive aggregation; if the goals are rather competitive, an aggregation based on an exclusive disjunction is appropriate; in case of independent goals a disjunctive aggregation is adequate.

4 Types of Relationships Between Goals Imply Way of Aggregation

The observation that cooperative goals imply conjunctive aggregation and conflicting goals rather lead to exclusive disjunctive aggregation is easy to understand from the intuitive point of view.

The fact that the relationships between goals are defined as fuzzy relations based on both positive and negative impacts of alternatives on the goals provides the information about the confluence and competition between the goals.

Figure 2 shows two different representative situations which can be distinguished appropriately only if, besides the positive impact of decision alternatives, their negative impact on goals is additionally considered.


Fig. 2 Distinguishing independence and trade-off based on positive and negative impact functions of goals

In case the goals were represented only by the positive impact of alternatives on them, situation A and situation B could not be distinguished, and a disjunctive aggregation would be considered plausible in both cases.

However, in situation B a decision set S1 ∪ S2 would not be appropriate because of the conflicts indicated by C12 and C21. In this situation the set (S1/D2) ∪ (S2/D1) could be recommended in case the priorities of both goals are similar (where X/Y is defined as the difference between the sets X and Y, that is, X/Y = X ∩ ¬Y, where X and Y are fuzzy sets and ¬Y is the complement of Y). In case one of the two goals, for instance goal 1, is significantly more important than the other one, the appropriate decision set would be S1. The aggregation used in that case should then not be a disjunction, but an exclusive disjunction between S1 and S2 with emphasis on S1.


This very important aspect can easily be integrated into decision making models by analyzing the types of relationships between goals as defined in Def. 4. The information about the relationships between goals, in connection with goal priorities, is used to define relationship-dependent decision strategies which describe the relationship-dependent way of aggregation. For conflicting goals, for instance, the following decision strategy, which deduces the appropriate decision set, is given:

If (g1 is in trade-off to g2) and (g1 is significantly more important than g2) then S1.

Another strategy for conflicting goals is the following:

If (g1 is in trade-off to g2) and (g1 is insignificantly more important than g2) then S1/D2.

Note that both strategies use priority information. Especially in case of a conflictive relationship between goals, priority information is crucial for an appropriate aggregation. Note also that the priority information can only be used adequately if, in the decision situation, there is a situation-dependent explicit calculation of which goals are conflictive and which of them are rather confluent (Felix 2000).

5 Related Approaches, a Brief Comparison

Since fuzzy set theory was suggested as a suitable conceptual framework for decision making (Bellman and Zadeh 1970), two directions in the field of fuzzy decision making have been observed. The first direction reflects the fuzzification of established approaches like linear programming or dynamic programming (Zimmermann 1991). The second direction is based on the assumption that the process of decision making is adequately modeled by axiomatically specified aggregation operators (Dubois and Prade 1984). None of the related approaches sufficiently addresses the crucial aspect of decision making, namely the explicit and non-hard-wired modeling of the interaction between goals. Related approaches like those described in (Dubois and Prade 1984) and (Biswal 1992) either require a very restricted way of describing the goals or postulate that decision making shall be performed based on rather formal characteristics of aggregation like commutativity or associativity. Other approaches are based on strictly fixed hierarchies of goals (Saaty 1992) or on modeling the decision situations as probabilistic or possibilistic graphs (Freeling 1984). In contrast, human decision makers usually proceed in a different way. They concentrate on the information about which goals are positively or negatively affected by which alternatives. Furthermore, they evaluate this information in order to infer how the goals interact with each other, and they ask for the actual priorities of the goals. In the sense that the decision making approach presented in this contribution explicitly refers to the interaction between goals in terms of the defined relationships, it significantly differs from other related approaches (Felix 1995).

The subsequently presented real world applications demonstrate the practical relevance of DMRG. In Sects. 6 and 7, two applications in financial services are described. Section 8 shows how DMRG is used for the optimization of the sequences with which car manufacturers organize the production process. Finally, it is shown

Page 360: Forging New Frontiers: Fuzzy Pioneers I

Fuzzy Decision Model Based on Relationships between Goals (DMRG) 357

in Sect. 9 how the concept of impact sets is used for image representation and that the relationships between goals can be applied for calculating the similarity of images.

6 Application to Calculating Limit Decisions for Factoring

Factoring is the business of purchasing the accounts receivable of a firm (the client) before their due date, usually at a discount. The bank to which the accounts are sold is called the factor. The factor has to decide what the maximum acceptable limit of the accounts shall be by evaluating the debtor of the client.

For making its limit decisions, the factor follows a set of criteria. The decision alternatives consist of limit levels expressed, for instance, by percentages of the amount requested from the factor. Examples of decision criteria are explained in detail below.

6.1 Examples of Handling Decision Criteria

Let us explain some typical decision criteria:

1. Statistical Information Index (SI): This index takes values from a specific SI interval, ranging between SI1 and SI2. There is, however, a subinterval of [SI1, SI2] where the decision is made only in interaction with other criteria.

2. Integration in the Market: This value indicates in which way the company has been active in the market and what the corresponding risk of default is. Experience shows that the relation between life span and risk of default is not inversely proportional, but shows non-linear behavior, for instance depending on relations to other companies.

3. Public Information Gained from Mass Media, Phone Calls, Staff, etc.: This kind of information has to be entered into the factoring system by means of special, qualitatively densified data statements.

4. Finance Information: This information concerns aspects like credit granting, utilization and overdrawing as well as receivables. The factor's staff express the information obtained using qualitatively scaled values. The values depend on the experience of interaction with other criteria.

6.2 Impact Sets

For expressing decision criteria, decision makers use linguistic statements. They say, for instance, that a particular criterion may have the values high, neutral or low, as


358 R. Felix

Table 1 Impact sets for the criterion C and its values

Limit granted in % of L   High   Neutral   Low

100 %                     ++     0         –
75 %                      ++     +         –
50 %                      +      ++        –
25 %                      –      +         –
0 %                       –      –         ++

presented in Table 1. The decision alternatives reflect a granularity of percentages of the limit requested by the client. For simplicity we choose here a granularity of 25%. That means, when the limit requested is L, the limit decision equals a percentage ≤ 100% of the requested amount L. For each criterion C and its corresponding value, the impact sets are defined according to Definition 1. For instance, for the value "high" of the criterion C, the entry "++" will be expressed, according to Def. 1, by a value of δ of S_C,high close to 1. Analogously, for the value "low" of the criterion C, the entry "–" will get as δ of D_C,high a value close to 1. This is reflected by the set S_C,high for positive and by the set D_C,high for negative impacts.

Since the decision maker is evaluating pros and cons, both positive and negative impacts must be expressed. Thus, both the positive and the negative impact sets S_C,high and D_C,high are needed.
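The encoding described above can be sketched as follows. This is an illustrative sketch only: the concrete δ degrees assigned to the symbolic entries "++", "+", "0" and "–" are assumptions, not values taken from the text.

```python
# Sketch: turning the symbolic column of Table 1 (criterion C = "high")
# into a positive impact set S and a negative impact set D over the
# decision alternatives (limit levels). The numeric degrees below are
# illustrative assumptions.

POS = {"++": 0.9, "+": 0.6, "0": 0.0, "-": 0.0}  # assumed positive degrees
NEG = {"++": 0.0, "+": 0.0, "0": 0.0, "-": 0.8}  # assumed negative degrees

# Column of Table 1 for the value "high" of criterion C.
TABLE_C_HIGH = {100: "++", 75: "++", 50: "+", 25: "-", 0: "-"}

def impact_sets(table):
    """Split a symbolic impact column into the fuzzy sets S and D."""
    S = {alt: POS[sym] for alt, sym in table.items() if POS[sym] > 0}
    D = {alt: NEG[sym] for alt, sym in table.items() if NEG[sym] > 0}
    return S, D

S_C_high, D_C_high = impact_sets(TABLE_C_HIGH)
# S_C_high == {100: 0.9, 75: 0.9, 50: 0.6}
# D_C_high == {25: 0.8, 0: 0.8}
```

The point of the split is exactly the one made in the text: pros and cons are kept in two separate fuzzy sets rather than collapsed into one score.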

6.3 Real World Application Environment

The factoring application presented in this section is part of an internet-based software system used by a factoring bank. The quality of the system's recommended decisions is equal to the decision quality achieved by senior consultants of the bank. An important aspect of the system is its standardization of the general quality of limit decisions. Using the system, a very high percentage of the limit decisions is made automatically.

7 Application to Cross-Selling

So-called cross-selling is part of the business process that takes place during every consulting conversation when a bank's customer applies for a credit. In case the bank employee decides to grant the credit, the question arises whether the bank should offer the customer additional products like, for instance, a life assurance, a credit card, a current account, etc. It is the task of the bank's sales assistant to define a ranking of the different products and then to select the product (decision alternative) which corresponds best to the customer's profile. The customer profile is defined by a variety of criteria which build up the bank's cross-selling policy. The task of cross-selling consists in checking to which extent the customer characteristics


(cross-selling signals) match the cross-selling criteria. Subsequently, the matching results are aggregated into the cross-selling product ranking. The cross-selling product ranking shows the bank's sales assistant in which sequence the cross-selling products are to be offered to the customer.

Let us express the decision making situation in terms of the decision makingmodel presented in Sect. 2:

The set A of decision alternatives is the set of cross-selling products, for instance: A := {life assurance (LA), accident assurance (AA), credit card (CC), current account (CA)}.

The set of decision goals G is the set of criteria under which the cross-selling ranking is performed, for instance: G := {customer lives close to a branch (LCB), customer already has other assurance policies (OAP), customer is young-aged (CYA), customer is old-aged (COA)}.

7.1 Impact Sets

The impact the cross-selling products have on the criteria is defined by the positive and negative impact functions (see Table 2). For simplicity, they are given in the following so-called impact matrix, which subsumes the resulting positive and negative impact sets S and D for each goal.

The ranking is calculated based on the impact matrix, which is the result of a knowledge acquisition process with human expert decision makers. The ranking calculation can be extended by defining priorities attached to the criteria. Attaching such priorities enables the bank to define strategies that express on which criteria which emphasis should be put while calculating the ranking.

Please note that in real world cross-selling the number of cross-selling products is about 10 and the number of goals (criteria) about 40. That means that a rule-based approach practically cannot be used for the representation of the cross-selling knowledge, because there are approximately 2^40 · 2^10 dependencies between the criteria (goals) and the decision alternatives (cross-selling products).

Using the decision model presented in Sect. 2, at most 2 · 10 · 40 = 800 statements are necessary. Generally speaking, using DMRG the complexity of the knowledge description, and therefore the complexity of the knowledge acquisition process, can be decisively reduced and is estimated by O(m · n), where m is the cardinality of the set of alternatives A and n the cardinality of the set of goals (criteria) G. In contrast to

Table 2 Example of an Impact Matrix for Cross-Selling Decisions

Goals          LCB           OAP           CYA           COA
Alternatives   S_LCB  D_LCB  S_OAP  D_OAP  S_CYA  D_CYA  S_COA  D_COA

LA             Ø      Ø      0.9    Ø      0.6    Ø      Ø      0.6
AA             Ø      Ø      0.6    Ø      0.9    Ø      0.6    Ø
CC             Ø      Ø      Ø      Ø      Ø      0.6    0.5    Ø
CA             0.8    Ø      Ø      Ø      0.9    Ø      0.6    Ø


this, conventional (fuzzy) rule based approaches require a complexity of O(2^m · 2^n). The decisively lower complexity of the knowledge acquisition process is a key issue when introducing fuzzy decision making models into real world application fields like the one presented in this section.
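The gap between the two growth rates is easy to make concrete for the sizes quoted in the text (m = 10 products, n = 40 criteria):

```python
# Back-of-the-envelope comparison of the two knowledge-acquisition
# complexities for m = 10 alternatives and n = 40 goals.

m, n = 10, 40

dmrg_statements = 2 * m * n            # O(m * n): one S and one D entry per pair
rule_based_dependencies = 2**m * 2**n  # O(2^m * 2^n)

print(dmrg_statements)           # 800
print(rule_based_dependencies)   # 1125899906842624 (= 2^50)
```

Eight hundred expert statements are a feasible knowledge acquisition task; roughly 10^15 dependencies are not.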

7.2 Real World Application Environment

The cross-selling application presented in this section is part of a credit and cross-selling software system used by a private customer bank with branches in more than 55 German cities and over 300 consultants using the application. The system has been working for more than five years and is implemented in all direct crediting and cross-selling processes. The quality of the system's recommended decisions is equal to the decision quality achieved by senior consultants of the bank. The system provides for a standardization of the cross-selling decision quality across the bank's branches.

8 Application to the Optimization of Production Sequences in Car Manufacturing

The production process of cars is performed in various successive steps. In the first step, the car bodies are built and moved to a store before the painting area. Then, the car bodies are fetched from the store, painted in the so-called paint shop and placed in another store before entering the assembly line. According to the production program, the painted car bodies are fetched from this store one by one and transported to the assembly line, where modules such as undercarriage, motor, wheels, or – depending on the particular car – special equipment such as electric window lifters, sunroofs or air-conditioning systems, are assembled until the entire car assembly process is finished.

The assembly is carried out at various assembly stations usually connected in series, where specialized teams of workers perform the different assembly work. Depending on the particular car and its equipment, each car body that passes the assembly line creates a different workload at the individual assembly stations. Figure 3 gives an overview of the structure of the assembly process. For simplicity, the store between the body works and the paint shop is not shown.

The sequence with which the car bodies are fetched from the store and entered into the assembly line is crucially important for both the degree of capacity utilization at the assembly stations and the smoothness of the assembly process. Ultimately, it is extremely important for both the quality of the cars and adequate production costs. A good sequence is organized in such a way that the utilization of capacity is as constant and as near to 100% as possible, in order to guarantee the minimal possible costs of the assembly process with respect to the cars to be produced in a planning period. At the same time the sequence must meet a large


Fig. 3 Assembly process of cars (body works → paint shop → high store → assembly area with assembly stations 4061, 4062, 4064 and 4065)

number of technical restrictions which define which sequences are admissible in terms of technical conditions, for instance, the availability of tools. These conditions basically describe which pieces of equipment can be assembled after one another and in which combination. Examples of such conditions are that approximately only every second car may have a sunroof (distribution goal), that exactly every third car must be a car with an air conditioning system (set point distance goal), that sedans must be sequenced in groups of four cars (group goal), etc. A particular car may have 50 and more different relevant equipment items. To each equipment item a distribution goal, a group goal or a distance goal may be attached simultaneously. This means that there are approximately 50 to 100 optimization goals and more to be met at the same time when calculating the sequence. Of course, the complexity of the optimization problem is exponential in terms of the number of these optimization goals (approximately 50 to 100) and factorial in terms of the number of cars to be sequenced.

8.2 Impact Sets

In the sense of DMRG, the distribution goals, the distance goals, the group goals etc. are goals g1, . . . , gn with their positive and negative impact sets S_g1, . . . , S_gn and D_g1, . . . , D_gn defined by the positive and negative impact functions. Let us describe how the impacts are defined based on an example of an optimization goal of the type 'group' (see Fig. 4). The value 'Setpoint' indicates the set point of the group. The values 'Min' and 'Max' indicate the minimum and the maximum length of the group, respectively. For values that are smaller than 'Min' the positive impact is high. Between 'Min' and 'Setpoint' the positive impact decreases to 0. Between 'Setpoint' and 'Max' the negative impact increases, and from the value 'Max' on the negative impact becomes the highest possible.
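The verbal description of the 'group' goal impacts can be sketched as a pair of piecewise functions. Since Fig. 4 is not reproduced here, the linear interpolation between the breakpoints and the function name are assumptions:

```python
def group_goal_impact(length, g_min, setpoint, g_max):
    """Positive and negative impact of extending the current group to
    the given length, following the verbal description of the 'group'
    goal. Linear interpolation between breakpoints is an assumption."""
    # Positive impact: high below Min, decreasing to 0 at the set point.
    if length < g_min:
        pos = 1.0
    elif length < setpoint:
        pos = (setpoint - length) / (setpoint - g_min)
    else:
        pos = 0.0
    # Negative impact: 0 up to the set point, rising to 1 at Max and beyond.
    if length <= setpoint:
        neg = 0.0
    elif length < g_max:
        neg = (length - setpoint) / (g_max - setpoint)
    else:
        neg = 1.0
    return pos, neg

# Example with Min = 3, Setpoint = 5, Max = 7:
# group_goal_impact(2, 3, 5, 7) -> (1.0, 0.0)   below Min: positive impact high
# group_goal_impact(8, 3, 5, 7) -> (0.0, 1.0)   beyond Max: negative impact highest
```

Keeping the positive and negative impacts as two separate values, rather than a single signed score, mirrors the S/D impact-set structure of DMRG.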

8.3 Calculation of a Sequence

The calculation of a sequence is performed iteratively. In every iteration step the current set points are calculated based on the current production status and the


Fig. 4 Positive and Negative Impact Sets for an Optimization Goal of Type ‘group’

content of the store. For every optimization goal the impact sets are calculated, and DMRG is invoked to select the best car for the next position within the sequence. The procedure terminates either when all available cars have been sequenced or when there are conflicts in the requirements on the optimization goals which cannot be resolved based on the content of the store. In such a case the current status of the conflicts and confluences between the goals is reported, based on the relationships between the goals specified in Def. 4. The relationships between the optimization goals are then used to readjust the optimization goals.
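The iterative loop can be sketched as follows. This is a simplified sketch, not the production system: the aggregation of per-goal impacts into a single score (sum of positive minus negative impacts) is a placeholder for the actual DMRG aggregation, and the function names are invented for illustration.

```python
# Sketch of the iterative sequencing loop: in each step, pick the car
# from the store whose aggregated goal impacts are best for the next
# sequence position.

def build_sequence(store, goal_impacts):
    """store: list of car ids.
    goal_impacts(car, current_sequence): list of (pos, neg) impact
    pairs, one pair per optimization goal (placeholder aggregation)."""
    sequence = []
    remaining = list(store)
    while remaining:
        def score(car):
            return sum(p - n for p, n in goal_impacts(car, sequence))
        best = max(remaining, key=score)
        if score(best) < 0:   # unresolvable goal conflict: stop and report
            break
        remaining.remove(best)
        sequence.append(best)
    return sequence

# Toy usage: a single distribution goal asking sunroof cars to alternate.
def alternate_sunroof(car, seq):
    want_sun = len(seq) % 2 == 0
    return [(1.0, 0.0)] if car.endswith("_sun") == want_sun else [(0.0, 1.0)]

order = build_sequence(["a_sun", "b_plain", "c_sun", "d_plain"], alternate_sunroof)
# order == ["a_sun", "b_plain", "c_sun", "d_plain"]
```

A greedy choice per position keeps each step cheap despite the factorial size of the full sequencing problem noted above.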

8.4 Real World Application

The principle of optimization presented in this section is used at more than 20 different factories for more than 30 different car models. It is used for both the pre-calculation of the sequences (planning mode) and the online execution of the sequence during the production process (execution mode). The planning mode is used for calculating sequences of a package of cars scheduled for a period of approximately one day and is performed two or three days before the production. The execution mode continuously optimizes the sequence during the assembly process.

9 Application to Image Comparison and Recognition

In the field of image analysis, fuzzy techniques are often used for the extraction of characteristics and for the detection of local image substructures like edges


(Bezdek 1994). In these approaches the results of the analysis procedure often improve the visual quality of the image, but they do not lead to an automated evaluation of the content of the image in the sense of image comparison as required, for instance, in the field of industrial quality inspection, where the membership to object classes has to be recognized in order to make ok-or-not-ok decisions. Other approaches like the so-called syntactic pattern recognition or utility functions (Gonzalez and Thomason 1978) refer more directly to a comparison of images, but the corresponding algorithms lead to a high computational complexity (polynomial or even exponential with respect to the number of image characteristics) (Felix et al. 1994). What is required in order to make a step forward in the field of image comparison and/or recognition is a comparison of the similarity of two or more images based on algorithms performing the comparison and/or recognition in linear computation time.

9.1 Impact Sets and Image Representation

One attempt to meet this requirement is the following: we represent images based on the concept of impact sets (Def. 1). The impact sets are derived from the presence and absence of elementary image characteristics in the particular image. Examples of such elementary image characteristics are average grey levels or average grey level gradients. The advantage of this image representation is that it can be derived from any image without knowing its content. The representation of images in terms of impact sets leads to the idea of considering the images as decision goals in the sense of DMRG.

If we accept this idea, then the positive and negative relationships as defined in Def. 4 can be used as a tool for the comparison of images. Positive relationships indicate similarity, negative relationships indicate non-similarity of images. The final recognition decision is made by comparing the value of the relationships, for instance, with empirically or statistically estimated (learned) threshold values. Another way to generate the final recognition decision is the selection of the image with the highest relationship value compared to a reference image that stands for a class of images to be identified. More complex decision and aggregation strategies as used in other applications of DMRG are not needed.

In order to integrate location-oriented information into the method of representing images by impact sets, we partition the whole image by a grid structure of sectors. Then, for each sector, the elementary characteristics (average grey level values and average grey level gradients) are calculated. In this way a vector of sector characteristics is obtained. Every component of the vector corresponds to one sector of the image and consists of the values of the characteristics. To every sector an index is attached. The corresponding indices of each vector component with its characteristics indicate a location-oriented distribution of the characteristics within the image. Based on the indices, impact sets can again be used to represent the image, where the index values are the elements and the characteristics of the sectors are the membership values. Please note that the computational complexity of the representation, comparison and recognition of images based, for instance, on the notion


of analogy (Def. 4.4) is O(n · m), where n is the number of image sectors and m the number of elementary characteristics under consideration.
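The sector-characteristics step can be sketched as below. This is an illustrative sketch: the original system's exact characteristics are not specified in the text, so the horizontal grey-level difference is used here as a simple stand-in for the gradient, and gradients across sector boundaries are ignored.

```python
def sector_characteristics(image, rows, cols):
    """image: 2-D list of grey levels. Partition the image into a
    rows x cols grid of sectors and return, per sector index, the pair
    (average grey level, average absolute horizontal grey-level
    difference). One pair per sector, indexed in row-major order."""
    h, w = len(image), len(image[0])
    features = []
    for i in range(rows):
        for j in range(cols):
            r0, r1 = i * h // rows, (i + 1) * h // rows
            c0, c1 = j * w // cols, (j + 1) * w // cols
            pixels = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            # Horizontal differences within the sector only (boundary
            # gradients are ignored in this simplified sketch).
            grads = [abs(image[r][c + 1] - image[r][c])
                     for r in range(r0, r1) for c in range(c0, c1 - 1)]
            features.append((sum(pixels) / len(pixels),
                             sum(grads) / len(grads) if grads else 0.0))
    return features

# A tiny 2 x 4 image, split into 1 x 2 sectors: dark left, bright right.
img = [[0, 0, 100, 100],
       [0, 0, 100, 100]]
feats = sector_characteristics(img, 1, 2)
```

Each loop touches every pixel once per characteristic, so the cost of building the vector is linear in the image size, consistent with the O(n · m) claim for the subsequent comparison over n sectors and m characteristics.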

9.2 Qualitative and Structural Image Representation, Comparison and Recognition

The higher the number of sectors, the more details of the image are represented in a location-oriented way. Therefore, with an increasing number of sectors the representation of the image becomes increasingly structural. The lower the number of sectors, the less structural information is considered and the more qualitative the representation of the image becomes. The term "qualitative" refers to image representation which does not consider information about the location of the characteristics within the image at all: qualitative analysis refers rather to the presence or absence of image characteristics, to their distribution within the image and to the intensity of the distribution. In contrast to this, structural image representation additionally takes location-oriented characteristics into account. In the extreme case of structural analysis, each sector corresponds to a single pixel. The extreme case of qualitative representation is reached when the entire image is contained as the one and only sector.

9.3 Real World Applications

Both structural and qualitative image analysis have been applied in a number of industrial applications in the field of quality inspection and optical object recognition. The problem-independent representation of images opens a variety of possible application fields. Inspection of the quality of punched work pieces, quality analysis of casings, and the quality of robot-controlled placement and positioning actions in process automation (Felix et al. 1994) are examples of successful applications. Subsequently, we describe the quality inspection of rubber surfaces and the optical recognition of tire profiles for the supervision of the tire flow in the process of tire production.

9.3.1 Recognition of the Quality of Rubber Surfaces

Surface classification of rubbers is needed for so-called soot dispersion checks of rubber quality (Felix, Reddig 1993). Different qualities of the dispersion process can be observed using a microscope. A high quality dispersion leads to smooth surfaces, whereas a lower quality is given when the surface becomes rough. Figure 5 shows an industrial standard classification consisting of six different quality classes. The upper left image in Fig. 5 represents the highest quality class, the lower right


Fig. 5 Six different quality classes of rubber surfaces and a surface to be classified

the lowest quality class. An example of a surface to be classified is given in the right-hand part of the figure.

In this application, basic characteristics such as average grey levels and grey level gradients are used without any additional location orientation. For the surface classification, location-oriented information is not needed, because of the purely qualitative character of the information being evaluated. Location-oriented information would even be a disturbance here, as two images with different locations of the same kind of disturbances represent identical surface qualities. The application has been implemented for an automotive supplier company producing so-called rubber-metal components.

9.3.2 Recognition of Tire Profiles for Supervision of Tire Flow

In a tire plant many thousands of tires are produced per day. Every day, hundreds of different tire types are transported by conveyors through the factory from the production area through the inspection area to the delivery store. Before entering the delivery store, the tires have to be supervised in order to ensure that every tire is going to be located in the right area of the store. For this purpose every single tire has to be identified for a supervised entry into the store. An important identification criterion is the profile of the tread of the tire. The structural image recognition discussed above has been applied within a tire identification system containing more than 200 different tire types. Figure 6 shows the user interface of the system used for the teach-in procedure of the reference images describing the tire types. In its recognition mode the system works automatically without any user interaction. It automatically recognizes the tires and passes the information about each recognized tire type to the PLC, which is responsible for the control of the conveyors and the supervision of the material flow. Figure 6 also indicates the grid partitioning the system uses in order to represent and evaluate the structure of the profile of the tire tread. The recognition of the profile of the tire tread is solved using the definition of the positive relationship of type 'analogy' (Def. 4). The recognition is positive if the value of the positive relationship reaches a specific threshold.


Fig. 6 Profile Recognition of Tires

10 Conclusions

In this contribution, real world applications of the decision making model called DMRG have been presented. The key issue of this model is the formal definition of both positive and negative relationships between decision goals. The relationships between goals are calculated driven by situation-dependent current decision data. The real world applications presented above cover a wide range of different application fields. What the applications have in common is that all of them take advantage of the explicit and situation-dependent calculation of relationships between the decision goals. This justifies, in the opinion of the author, the statement that explicit modeling of relationships between goals is crucial for the further development of adequate decision making models.

References

1. Bellman RE, Zadeh LA (1970) Decision Making in a Fuzzy Environment. Management Sciences 17: 141–164

2. Bezdek JC (1994) Edge Detection Using the Fuzzy Control Paradigm. Proceedings of EUFIT '94, Aachen

3. Biswal MP (1992) Fuzzy programming technique to solve multiobjective geometric programming problems. Fuzzy Sets and Systems 51: 67–71

4. Dubois D, Prade H (1984) Criteria Aggregation and Ranking of Alternatives in the Framework of Fuzzy Set Theory. Studies in Management Sciences 20: 209–240

5. Dubois D, Prade H (1992) Fuzzy Sets and Possibility Theory: Some Applications to Inference and Decision Processes. In: Reusch B (ed) Fuzzy Logik - Theorie und Praxis. Springer, Berlin, pp 66–83


6. Felix R (1991) Entscheidungen bei qualitativen Zielen. Ph.D. thesis, University of Dortmund, Department of Computer Sciences

7. Felix R, Reddig S (1993) Qualitative Pattern Analysis for Industrial Quality Assurance. Proceedings of the 2nd IEEE Conference on Fuzzy Systems, San Francisco, USA, Vol. I: 204–206

8. Felix R (1994) Relationships between goals in multiple attribute decision making. Fuzzy Sets and Systems 67: 47–52

9. Felix R, Kretzberg T, Wehner M (1994) Image Analysis Based on Fuzzy Similarities. In: Kruse R, Gebhardt J, Palm R (eds) Fuzzy Systems in Computer Science. Vieweg, Wiesbaden

10. Felix R (1995) Fuzzy decision making based on relationships between goals compared with the analytic hierarchy process. Proceedings of the Sixth International Fuzzy Systems Association World Congress, Sao Paulo, Brazil, Vol. II: 253–256

11. Felix R, Kühlen J, Albersmann R (1996) The Optimization of Capacity Utilization with a Fuzzy Decision Support Model. Proceedings of the Fifth IEEE International Conference on Fuzzy Systems, New Orleans, USA, pp 997–1001

12. Felix R (2000) On fuzzy relationships between goals in decision analysis. Proceedings of the 3rd International Workshop on Preferences and Decisions, Trento, Italy, pp 37–41

13. Felix R (2001) Fuzzy decision making with interacting goals applied to cross-selling decisions in the field of private customer banking. Proceedings of the 10th IEEE International Conference on Fuzzy Systems, Melbourne, Australia

14. Freeling ANS (1984) Possibilities Versus Fuzzy Probabilities – Two Alternative Decision Aids. In: Fuzzy Sets and Decision Analysis, pp 67–82, Studies in Management Sciences, Vol. 20

15. Gonzalez RC, Thomason MG (1978) Syntactic Pattern Recognition – An Introduction. Addison-Wesley

16. Saaty TL (1980) The Analytic Hierarchy Process. McGraw-Hill

17. Zimmermann HJ (1991) Fuzzy Set Theory and its Applications. Kluwer/Nijhoff


Applying Fuzzy Decision and Fuzzy Similarity in Agricultural Sciences

Marius Calin and Constantin Leonte

Abstract Decision making in environments that do not allow much algorithmic modeling is not easy. Even more difficulties arise when part of the knowledge is expressed through linguistic terms instead of numeric values. Agricultural Sciences are such environments, and Plant Breeding is one of them. This is why Fuzzy Theories can be very useful in building Decision Support Systems in this field. This chapter presents a number of Fuzzy methods to be used for decision support in the selection phase of Plant Breeding programs. First, the utilization of the Fuzzy Multi-Attribute Decision Model is discussed and a case study is analyzed. Then, two Fuzzy Querying methods are suggested. They can be used when a database must be searched in order to extract the information to be utilized in decision making. In this context, the concept of Fuzzy Similarity is important. Ways to employ it in database querying are suggested.

1 Introduction

1.1 Why Use Fuzzy Theories in Agricultural Research?

Fuzzy Theories provide powerful tools to formalize linguistic terms and vague assertions. It was long ago accepted that in Biological Sciences much knowledge is expressed in this form. Being strongly related to the former, Agricultural Sciences are also fit for such approaches. Many decisions are made not by means of numeric values, but on the basis of previous expertise, non-numeric evaluation, or linguistically expressed similarities.

An important branch of Agricultural Sciences is Plant Breeding. The basic aim of a Plant Breeding program is to produce a cultivated crop variety (a so-called cultivar) with superior characteristics, i.e. higher production levels and/or better adaptation to some environmental factors. Recent developments in Genetics

Marius Calin · Constantin Leonte
The "Ion Ionescu de la Brad" University of Agricultural Sciences and Veterinary Medicine of Iasi, Aleea Mihail Sadoveanu no. 3, Iasi 700490, Romania
e-mail: {mcalin, cleonte}@univagro-iasi.ro

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. 369 © Springer 2007


370 M. Calin, C. Leonte

brought new approaches in Plant Breeding. Detailed descriptions of different genotypes can now be made and, through Genetic Engineering, direct action at DNA level has become possible. However, the large-scale use of Genetic Engineering for obtaining genetically modified crops is a heavily disputed issue. The danger of the unmonitored, widespread use of transgenic seeds has been emphasized by numerous scientists and organizations [11, 6]. In this context, organic agriculture and organic food production become more and more popular. These activities encourage the development of organic plant breeding programs. A new concept, Participatory Plant Breeding (PPB), emerged in the last decade. PPB aims to involve [20] many actors in all major stages of the breeding and selection process, from scientists, farmers and rural cooperatives, to vendors and consumers. All this shows that classic work procedures in Plant Breeding are far from becoming obsolete.

Within a Plant Breeding program many decision situations occur. The parental material consists of different sources of cultivated and spontaneous forms (so-called germplasm). The breeder's decisions [13] include: which parents to use, how to combine them, what, when, and how to select, what to discard, and what to keep. Usually, a large number of individuals are taken into consideration. For each of them, a set of characters is measured in order to be used in the selection process. The number of monitored characters can also be large.

As one can see, selection is an important step. Its efficiency is conditioned by many factors. A very important one is the expert's ability to identify the individuals that fulfill the goals for most of the considered characters. The difficulty of processing a large amount of numerical data in order to make the selection is evident, and the usefulness of software tools to aid decision making and data retrieval doesn't need any further argumentation.

However, much of the handled information is expressed through linguistic terms rather than crisp, numerical values. A plant breeding professional will properly deal with qualifiers like "low temperature", "about 50 grains per spikelet" or "medium bean weight", despite their imprecise nature. If such terms are to be handled within a software tool, such as a Decision Support System, they must be properly formalized. Fuzzy Sets Theory and Fuzzy Logic provide the mathematical support for such formalization.

1.2 Fuzzy Terms, Fuzzy Variables, and Fuzzy Relations

To discuss the topics in the subsequent sections, some well-known definitions [23, 17, 16] must be reviewed.

Definition 1. A fuzzy set is a mapping

F : U → [0, 1]. (1)


Applying Fuzzy Decision and Fuzzy Similarity in Agricultural Sciences 371

where for any x ∈ U, F(x) is the membership degree of x in F. The referential set U is said to be the universe of discourse. The family of all fuzzy sets in U can be denoted F(U).

Definition 2. For a fuzzy set F on U, the following parts are defined [17]:

- the kernel of F:    KER(F) =def {x | x ∈ U ∧ F(x) = 1},    (2)

- the co-kernel of F: COKER(F) =def {x | x ∈ U ∧ F(x) = 0},    (3)

- the support of F:   SUPP(F) =def {x | x ∈ U ∧ F(x) > 0}.    (4)
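As a quick illustration (my own sketch, not code from the chapter), the three parts in Definition 2 can be computed for a fuzzy set over a finite universe, stored here as a Python dict from elements to membership degrees:

```python
# Kernel, co-kernel, and support of a discrete fuzzy set (Definition 2).
# The fuzzy set F maps elements of the universe U to membership degrees.

def kernel(F):
    """KER(F): elements with full membership, F(x) = 1."""
    return {x for x, mu in F.items() if mu == 1.0}

def co_kernel(F):
    """COKER(F): elements with zero membership, F(x) = 0."""
    return {x for x, mu in F.items() if mu == 0.0}

def support(F):
    """SUPP(F): elements with non-zero membership, F(x) > 0."""
    return {x for x, mu in F.items() if mu > 0.0}

# A toy fuzzy set over U = {1, ..., 5}
F = {1: 0.0, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.0}
print(kernel(F))     # {3}
print(co_kernel(F))  # {1, 5}
print(support(F))    # {2, 3, 4}
```

Note that KER(F) is non-empty exactly when F is normal in the sense of the remark below.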

A linguistic expression can be represented as a fuzzy set whose universe of discourse is defined by the referred element (temperature, concentration, weight, etc.). This is why the names fuzzy term and linguistic term are considered equivalent. The membership function F(x) describes the degree to which, for any value in U, the linguistic statement may be considered true. Several fuzzy terms in the family F(U) can be defined so as to cover the entire universe of discourse with non-zero values; together they form a fuzzy variable [17].

The definition of fuzzy terms and fuzzy variables can rely on various criteria like subjective belief, prior experience, or statistical considerations. In fact, defining the appropriate fuzzy variables is a most important step in approaching a problem through Fuzzy Theories. Examples of fuzzy variables defined on different universes of discourse will be given later on a real-world application. Anticipating the discussion, three preliminary remarks can be made.

• Most generally, the membership function F(x) can have any shape. In practice there are only a few shapes utilized for defining fuzzy terms. The most frequently used are the trapezoidal fuzzy sets, also named Fuzzy Intervals, and the triangular fuzzy sets or Fuzzy Numbers [21].

• In most cases, a real-world problem leads to modeling the linguistic terms through normal fuzzy sets. A fuzzy set F is called normal if there is at least one x ∈ U such that F(x) = 1.

• It has been remarked that for defining a fuzzy variable, no more than nine fuzzy terms are generally needed. In many cases three or five are sufficient. This is directly related to the human ability to distinguish qualitative nuances.

Linguistic terms and variables are often used in various branches of the biological sciences.

Definition 3. Let U and V be two arbitrary sets and U×V their Cartesian product. A fuzzy relation (in U×V) is a fuzzy subset R of U×V.

Remark 1. When U = V (the relation is defined between elements of the same universe), R is said to be a fuzzy relation in U.

Definition 4. Let R ⊆ X×Y and S ⊆ Y×Z be two fuzzy relations. The composition of the given fuzzy relations, denoted R ◦ S, is the fuzzy relation


372 M. Calin, C. Leonte

R ◦ S(x, z) = sup_{y∈Y} min(R(x, y), S(y, z)).    (5)
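For finite universes, relation (5) reduces to a max-min product of the relation matrices. The following sketch (with invented example values, not taken from the chapter) shows the computation:

```python
# Sup-min (max-min) composition of two fuzzy relations given as matrices:
# R on X×Y and S on Y×Z, per relation (5).
import numpy as np

def sup_min_composition(R, S):
    """(R ∘ S)(x, z) = max over y of min(R[x, y], S[y, z])."""
    m, p = R.shape
    p2, n = S.shape
    assert p == p2, "inner dimensions must agree"
    T = np.zeros((m, n))
    for i in range(m):
        for k in range(n):
            T[i, k] = np.max(np.minimum(R[i, :], S[:, k]))
    return T

R = np.array([[0.2, 0.8], [1.0, 0.4]])
S = np.array([[0.5, 0.9], [0.3, 1.0]])
print(sup_min_composition(R, S))  # [[0.3 0.8], [0.5 0.9]]
```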

Departing from the classic concept of the equivalence relation, an extension to Fuzzy Sets Theory [16] was made: the relation of similarity. It aims to model and evaluate the "resemblance" between several members of a universe of discourse.

Definition 5. A relation of similarity R in the universe U is a fuzzy relation with the following properties:

- reflexivity: R(x, x) = 1, x ∈ U;    (6)

- symmetry: R(x, y) = R(y, x), x, y ∈ U;    (7)

- transitivity: R ◦ R ⊆ R, i.e.

  R(x, z) ≥ sup_{y∈U} min(R(x, y), R(y, z)), x, z ∈ U.    (8)

Definition 6. A relation R is a relation of tolerance [21] iff it has the properties of reflexivity and symmetry.

When the universe U is finite, U = {x1, . . . , xn}, a fuzzy relation R can be described through an n × n matrix AR = (aij), i, j = 1, . . . , n. The elements aij express the relation between the corresponding pairs of members of the universe U and have the property 0 ≤ aij ≤ 1. If R is reflexive, then aii = 1, i = 1, . . . , n. If R is symmetrical, then aij = aji, i, j = 1, . . . , n.
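On a finite universe, the three properties of Definition 5 can be checked directly on the matrix AR. A sketch (the matrix values below are invented for illustration):

```python
# Checking reflexivity, symmetry, and sup-min transitivity (relations
# (6)-(8)) of a fuzzy relation represented as an n×n matrix.
import numpy as np

def is_reflexive(A):
    return bool(np.all(np.diag(A) == 1.0))

def is_symmetric(A):
    return bool(np.allclose(A, A.T))

def is_min_transitive(A):
    """R ∘ R ⊆ R under sup-min composition, relation (8)."""
    n = A.shape[0]
    comp = np.array([[np.max(np.minimum(A[i, :], A[:, j]))
                      for j in range(n)] for i in range(n)])
    return bool(np.all(comp <= A + 1e-12))

A = np.array([[1.0, 0.8, 0.8],
              [0.8, 1.0, 0.9],
              [0.8, 0.9, 1.0]])
print(is_reflexive(A), is_symmetric(A), is_min_transitive(A))  # True True True
```

A matrix passing only the first two checks represents a relation of tolerance (Definition 6); passing all three makes it a relation of similarity.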

A classical relation of equivalence allows the partition of the universe into classes of equivalence. In the case of fuzzy relations of similarity, defining the corresponding classes of similarity is possible, although handling them proves difficult. Unlike classes of equivalence, which are disjoint, classes of similarity can overlap. Moreover, when defining a fuzzy relation of similarity in a real-world application, it is quite difficult to fulfill the property of transitivity. This is why attempts were focused on defining fuzzy similarity rather than fuzzy equivalence [16, 21, 25]. One of these definitions [25] is used in a subsequent section to build a fuzzy database querying method dedicated to assisting decision in the selection phase of a plant breeding program.

Departing from Definition 5, another way to define a fuzzy relation of similarity is to build it as the limit of a sequence of fuzzy relations [16]. Corresponding classes of similarity can then be defined. The departing point is a reflexive and symmetrical fuzzy relation. In practice, the actual problem is to define this relation in a convenient way. In Sect. 3.3 a method is presented to express the matching degree of two arbitrary intervals through a fuzzy relation having the properties of reflexivity and symmetry, that is, a fuzzy relation of tolerance. This matching degree can be used in assessing the "equivalence" of interval-valued characters between a prototype plant and other varieties of the same kind.


2 Applying Fuzzy Decision in Plant Breeding

2.1 The Fuzzy Multi-Attribute Decision Model

A decision problem that can be approached using the Multi-Attribute Decision Model (MADM) is expressed by means of an m × n decision matrix:

      C1    C2    . . .  Cn

X1    x11   x12   . . .  x1n
X2    x21   x22   . . .  x2n
. . .
Xm    xm1   xm2   . . .  xmn

The lines of the decision matrix stand for the m decision alternatives X1, . . . , Xm that are considered within the problem, and the columns signify the n attributes or criteria according to which the desirability of an alternative is to be judged. An element xij expresses in numeric form the performance rating of the decision alternative Xi with respect to the criterion Cj. The aim of MADM is to determine an alternative X∗ with the highest possible degree of overall desirability. There are several points of view in designating X∗. The maximin and maximax methods are two classical approaches (short descriptions and examples in [7]).

Classic MADM works with criteria that allow exact, crisp expression (e.g. technical features of some device). But within an MADM, the decision-maker could express the criteria Cj as linguistic terms, which leads to the utilization of fuzzy sets to represent them. Thus, a Fuzzy MADM is obtained and the element xij of the decision matrix becomes the value of a membership function: it represents the degree to which the decision alternative Xi satisfies the fuzzy criterion Cj.

To find the most desirable alternative, the decision-maker uses an aggregation approach, which generally has two steps [24]:

• Step 1. For each decision alternative Xi, the aggregation of all judgements with respect to all goals.

• Step 2. The rank ordering of the decision alternatives according to the aggregated judgements.

The aforesaid maximin and maximax methods for classic MADM are such aggregation approaches, using the min and max operators. For example, in the maximin method, in step 1 the minimum value from each line of the decision matrix is picked; then, in step 2, the maximum of the previously found m values is determined. Moving towards the Fuzzy MADM domain, Bellman and Zadeh [1] introduced the Principle of Fuzzy Decision Making, stating the choice of the "best compromise alternative" by applying the maximin method to the membership degrees expressed by xij.
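The two-step maximin procedure can be sketched as follows (the decision matrix of membership degrees is invented for illustration; rows are alternatives, columns are criteria):

```python
# Maximin aggregation on a decision matrix of membership degrees x_ij.
import numpy as np

X = np.array([[0.7, 0.9, 0.4],   # alternative X1
              [0.6, 0.5, 0.8],   # alternative X2
              [0.9, 0.3, 0.9]])  # alternative X3

row_minima = X.min(axis=1)         # step 1: worst satisfaction per alternative
best = int(np.argmax(row_minima))  # step 2: alternative with the best worst case
print(row_minima, "-> choose X%d" % (best + 1))
```

Here X2 wins: its worst criterion value (0.5) is higher than any other alternative's worst value, which is exactly Bellman and Zadeh's "best compromise".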

There are numerous other methods that use diverse aggregation operators other than min and max. Not all of them were particularly intended for solving MADM problems, but many can be used in step 1. Authors enumerate different types of averaging operators [7, 24] appropriate for modeling real-world situations where people have a natural tendency to apply averaging judgements to aggregate criteria. A special mention must be made of the ordered weighted averaging (OWA) operators. These were introduced by Yager [22] and have the basic properties of averaging operators, but are additionally characterized by a weighting function.

A simple and frequently used method to perform step 1 of the MADM, the aggregation for each decision alternative Xi, is the calculation of an overall rating Ri using relation (9):

Ri = ∑_{j=1}^{n} wj xij , i = 1, . . . , m,    (9)

where each wj is a weight expressing the importance of criterion Cj. Having these ratings computed, step 2 simply consists of sorting them in descending order.
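Relation (9) and the subsequent ranking can be sketched as follows (weights and membership degrees are made-up numbers, not the case-study data):

```python
# Weighted-sum aggregation (relation (9)) followed by descending ranking.
import numpy as np

w = np.array([0.5, 0.3, 0.2])   # criterion weights w_j
X = np.array([[1.0, 0.2, 0.7],  # membership degrees x_ij, one row per alternative
              [0.4, 0.9, 0.8],
              [0.8, 0.8, 0.1]])

R = X @ w                        # overall ratings R_i
ranking = np.argsort(-R)         # step 2: indices in descending order of R_i
for place, i in enumerate(ranking, start=1):
    print(place, "alternative", i + 1, "score", round(float(R[i]), 3))
```

With these numbers the ratings are 0.70, 0.63, and 0.66, so the final order is alternative 1, then 3, then 2.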

2.2 The Analytic Hierarchy Process

The problem in calculating Ri is to evaluate the weights wj. Usually they express a subjective importance that the decision-maker assigns to the respective criteria. A fixed weight vector is difficult to find, as human perception and judgment can vary according to information inputs or psychological states of the decision-maker [7].

For the evaluation of the subjective weights wj, Saaty [18] proposed an intuitive method: AHP (the Analytic Hierarchy Process). It is based on a pairwise comparison of the n criteria. The decision-maker builds a matrix A = [akl]n×n whose elements akl are assigned with respect to the following two rules:

• Rule 1. akl = 1/alk , k = 1, . . . , n, l = 1, . . . , n;
• Rule 2. If criterion k is more important than criterion l, then assign to akl a value from 1 to 9 (see in Table 1 a guide for assigning such values).

The weights wj are determined as the components of the normalized eigenvector corresponding to the largest eigenvalue of the matrix A. This requires solving the equation det[A − λI] = 0, which might be a difficult task when A is a large matrix, that is, when many decision criteria are considered. An alternative solution [7] that gives a good approximation of the normalized eigenvector components, but is easier to apply, consists of calculating the normalized geometric means of the rows of matrix A.

Table 1 Saaty's scale of relative importance

Intensity of relative importance    Definition

1             equal importance
3             weak importance
5             strong importance
7             demonstrated importance
9             absolute importance
2, 4, 6, 8    intermediate values
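The geometric-mean approximation can be sketched as follows (using a hypothetical 3-criteria comparison matrix, not the chapter's Table 4):

```python
# AHP weights via normalized geometric means of the rows of the
# pairwise-comparison matrix A (the easy alternative to the eigenvector).
import numpy as np

A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])   # reciprocal matrix: A[l,k] = 1/A[k,l]

gm = np.prod(A, axis=1) ** (1.0 / A.shape[1])  # geometric mean of each row
w = gm / gm.sum()                               # normalize so the weights sum to 1
print(np.round(w, 3))
```

For this matrix the weights come out roughly (0.64, 0.26, 0.10), reproducing the intended ordering of the criteria.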

2.3 Applying the Fuzzy MADM within a Plant Breeding Program

As shown above, every plant-breeding program involves a selection phase. When choosing the parents for the next stages, a number of characters are observed and measured. The result of these measurements is a list of numeric values: one row for each plant and one column for each character. Afterwards, this list must be examined in order to decide which plants to select. It is a large one, because the number of examined individuals is usually great and numerous characters are measured. This is why a decision support tool can be useful, and MADM can underlie it. Moreover, the utilization of fuzzy techniques is appropriate, as the specialist often deals with linguistic values. This leads to the Fuzzy MADM.

The assisted selection methodology [4] has seven steps that are described further on. The application of each step will be illustrated through an example that consists of selecting the parents from a set of 150 plants of pod bean, variety Cape.

The study aimed to evaluate the potential of this cultivar in view of adapting it to new site factors. Ten characters were measured for each plant: height, number of branches, number of pods, average length of pods, average diameter of pods, number of beans, average number of beans per pod, weight of beans, average weight of beans per pod, and weight of 1000 beans. Table 2 contains the results obtained for only 10 out of the 150 plants. It suggests the difficulty of dealing with the entire list of 1500 entries.

Table 2 Results of measuring some characters of 10 plants of the pod bean variety Cape

ID  H (cm)  Brs  P/pl  Avg. p.l. (cm)  Avg. p.d. (cm)  B/pl    Avg. B/pd  Bw/pl  Avg. Bw/pd  W1000 (g)

1   29      5    10    9.80            0.78            32.00   3.20       10.20  1.02        318
2   43      7    8     8.12            0.68            16.00   2.00       4.20   0.52        262
3   35      7    13    9.84            0.76            40.00   3.07       13.30  1.02        332
4   39      6    12    9.25            0.75            40.00   3.33       11.00  0.91        275
5   41      10   25    11.04           0.73            102.00  4.08       30.50  1.22        299
6   46      8    18    10.33           0.75            58.00   3.22       15.80  0.87        272
7   32      8    17    8.64            0.64            43.00   2.52       12.00  0.70        279
8   31      6    7     9.85            0.85            27.00   3.85       7.50   1.07        278
9   38      7    16    8.18            0.56            30.00   1.87       10.20  0.63        340
10  32      7    10    10.00           0.88            39.00   3.90       10.50  1.05        269

H Height; Brs Branches; P/pl Pods per plant; Avg. p.l. Average pod length; Avg. p.d. Average pod diameter; B/pl Beans per plant; Avg. B/pd Average beans per pod; Bw/pl Beans weight per plant; Avg. Bw/pd Average beans weight per pod; W1000 Weight of 1000 beans.


Fig. 1 The variability of the average length of pods per plant (histogram: frequency, %, per class of average pod length, cm; classes ranging from 0–7 up to 16–17.5 cm)

2.3.1 Defining the Linguistic Variables

Linguistic variables can be defined through various means. One method is using the information obtained from a statistical study. Such studies are always performed within plant breeding programs.

In the case study, ten fuzzy variables were defined, one for each observed character. The maximum number of fuzzy terms within a fuzzy variable was five. The fuzzy terms were modeled using trapezoidal fuzzy sets.

The following example describes the fuzzy variable length_of_pods, which was defined by examining the results of the statistical study. The fuzzy variable regards the (average) length of pods per plant and comprises three fuzzy terms: low, medium, high.

Figure 1 shows the variability chart of the character average length of pods per plant (cm). Figure 2 illustrates the corresponding fuzzy variable length_of_pods.

The construction of these linguistic variables is advisable even though they may not always be used by the plant breeder. For some selection criteria, other linguistic evaluations might be preferred.
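A trapezoidal term and a three-term variable in the spirit of length_of_pods can be sketched as follows. The breakpoints are illustrative guesses based on the general shape of the variable, not the chapter's exact definitions:

```python
# A trapezoidal membership function and a three-term fuzzy variable.
# Breakpoints (a, b, c, d): 0 below a, rising on [a, b], 1 on the kernel
# [b, c], falling on [c, d], 0 above d.

def trapezoid(x, a, b, c, d):
    if b <= x <= c:          # kernel
        return 1.0
    if x <= a or x >= d:     # outside the support
        return 0.0
    if x < b:                # rising edge
        return (x - a) / (b - a)
    return (d - x) / (d - c) # falling edge

# Hypothetical definition of length_of_pods (average pod length, cm)
length_of_pods = {
    "low":    lambda x: trapezoid(x, 5.0, 5.0, 7.0, 8.5),     # left shoulder
    "medium": lambda x: trapezoid(x, 7.0, 8.5, 10.0, 11.5),
    "high":   lambda x: trapezoid(x, 10.0, 11.5, 15.0, 15.0), # right shoulder
}

x = 9.0  # a measured average pod length, in cm
print({term: round(mu(x), 2) for term, mu in length_of_pods.items()})
```

Evaluating every term of every variable at the measured values is exactly what produces a membership-degree table like Table 3 below.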

Fig. 2 Representation of the fuzzy variable length_of_pods (trapezoidal membership functions Low, Medium, and High over the average length of pods per plant, 5–15 cm)


2.3.2 Stating the Selection Criteria

According to the situation, the selection criteria can be stated using fuzzy terms or crisp criteria. However, as the case study shows, there can be criteria that seem to have a crisp definition but must also be fuzzified. The selection criteria used in the case study are enumerated further on.

• Height: normal. This linguistic value was chosen because the statistical study did not reveal any special correlation between height and the other characters.

• Number of branches: great. A great number of branches determines a great number of pods.

• Number of pods: great. This being a pod bean variety, this is the most important criterion.

• Average length of pods: medium. Further processing requirements and marketing reasons imposed this linguistic value.

• Average diameter of pods: medium. Same reasons as above.

• Number of beans: medium. Even though it is a pod bean variety, beans are important for taste and nutritional parameters. Too small or too large a number of beans can lower the overall quality.

• Average number of beans per pod: medium. Same reasons as above.

• Weight of beans: approximately 15–20 g. The selection criterion was in this case defined as an interval rather than using a fuzzy term of the fuzzy variable Weight of beans. This choice was determined by the results acquired [14] in studying the statistical correlation between this character and the others. Figure 3 shows such a correlation.

A fuzzification of this criterion was also made to allow degrees of variable confidence on both sides of the interval 15–20. The corresponding trapezoidal fuzzy set is shown in Fig. 4.

Fig. 3 Correlation between the weight of beans per plant and the weight of 1000 beans (regression: y = –0.4963x² + 19.055x + 118.93, R² = 0.5314; x: weight of beans per plant, g; y: weight of 1000 beans, g)


Fig. 4 Fuzzification of the interval 15–20 (trapezoidal fuzzy set over the weight of beans per plant, g, shown together with the terms Small, Medium, and Large)

• Average weight of beans per pod: approximately between 1 and 1.5 g. Similar reasons as before determined the choice of a fuzzified interval.

• Weight of 1000 beans: big. This parameter ensures a high quality of beans; this is why high values are desirable.

2.3.3 Applying the Selection Criteria

By applying the selection criteria expressed as fuzzy sets, one obtains the values of the corresponding membership functions, that is, the measure in which each plant satisfies each criterion. In Table 3, some results for the individuals within the case study are shown. The same plants as in Table 2 were chosen.

Table 3 Results of applying the selection criteria

ID  H     Brs  P/pl  Avg. p.l.  Avg. p.d.  B/pl  Avg. B/pd  Bw/pl  Avg. Bw/pd  W1000

1   0.33  0.   0.    1.         1.         1.    1.         0.04   1.          0.36
2   0.    0.   0.    0.75       1.         0.4   0.5        0.     0.          0.
3   0.67  0.   0.    1.         1.         1.    1.         0.66   1.          0.64
4   0.    0.   0.    1.         1.         1.    1.         0.2    0.64        0.
5   0.    1.   0.    1.         1.         0.    0.96       0.     1.          0.
6   0.    0.   1.    1.         1.         0.8   1.         1.     0.48        0.
7   1.    0.   1.    1.         1.         1.    0.76       0.4    0.          0.
8   1.    0.   0.    1.         0.5        1.    1.         0.     1.          0.
9   0.    0.   1.    0.79       0.6        1.    0.44       0.04   0.          0.8
10  1.    0.   0.    1.         0.2        1.    1.         0.1    1.          0.

H Height; Brs Branches; P/pl Pods per plant; Avg. p.l. Average pod length; Avg. p.d. Average pod diameter; B/pl Beans per plant; Avg. B/pd Average beans per pod; Bw/pl Beans weight per plant; Avg. Bw/pd Average beans weight per pod; W1000 Weight of 1000 beans.

2.3.4 Calculating the Weights of the Selection Criteria

One of the most time-consuming steps of this methodology is the two-by-two comparison of the selection criteria in view of determining the elements of Saaty's matrix, from which the weights expressing the relative importance of each criterion are subsequently computed. Table 4 illustrates Saaty's matrix for the selection criteria in the case study. Next to the matrix are the corresponding calculated weights. As one can see, the most important character is the number of pods per plant. In the next two places are other important characters: the diameter of pods and the length of pods. Thus, the calculation gives a numerical expression to some obvious intuitive ideas.

2.3.5 Computing the Overall Ratings for Decision Alternatives

The overall ratings are computed according to relation (9) defined earlier. Thus, a final score is assigned to each studied plant. The maximum possible score is equal to the sum of the weights. For the studied pod bean variety, this calculation involves the membership degrees shown partially in Table 3 and the weights shown in Table 4. Relation (9) becomes

Ri = ∑_{j=1}^{10} wj xij , i = 1, . . . , 150.    (10)

The maximum score that one plant can obtain is, in this case study, 3.919442.

Table 4 The elements of Saaty's matrix and the computed weights of importance

Chr   H    Brs  P/pl Avg.p.l. Avg.p.d. B/pl Avg.B/pd Bw/pl Avg.Bw/pd W1000  Weight
      (1)  (2)  (3)  (4)      (5)      (6)  (7)      (8)   (9)       (10)

(1)   1    1/3  1/7  1/5      1/5      1    1/3      1/3   1/3       1/3    0.10502
(2)   3    1    1/5  1/3      1/3      5    5        3     3         1      0.39971
(3)   7    5    1    4        1        3    1        7     7         7      1
(4)   5    3    1/4  1        1        5    2        3     2         3      0.58489
(5)   5    3    1    1        1        5    3        3     2         5      0.73633
(6)   1    1/5  1/3  1/5      1/5      1    1        1     1         1      0.16855
(7)   3    1/5  1    1/2      1/3      1    1        3     3         5      0.35437
(8)   3    1/3  1/7  1/3      1/3      1    1/3      1     1         1      0.18050
(9)   3    1/3  1/7  1/2      1/2      1    1/3      1     1         3      0.21848
(10)  3    1    1/7  1/3      1/5      1    1/3      1     1/3       1      0.17152

Chr Character; H Height; Brs Branches; P/pl Pods per plant; Avg. p.l. Average pod length; Avg. p.d. Average pod diameter; B/pl Beans per plant; Avg. B/pd Average beans per pod; Bw/pl Beans weight per plant; Avg. Bw/pd Average beans weight per pod; W1000 Weight of 1000 beans.

2.3.6 Ranking the Individuals

The final hierarchy is determined by sorting the overall ratings in descending order. The top ten final results for the case study are listed in Table 5. In first place is the individual with ID number 110, which attained a final score of 3.57, that is, 91.12% of the maximum possible score.


Table 5 The final hierarchy (top 10 places)

Place  ID   Score  Percent

1      110  3.57   91.12%
2      39   3.37   85.86%
3      28   3.01   76.81%
4      117  2.95   75.16%
5      64   2.64   67.43%
6      31   2.63   67.04%
7      96   2.61   66.59%
8      21   2.57   65.55%
9      149  2.52   64.34%
10     51   2.51   64.16%

2.3.7 Selecting the Parents

The plant-breeding expert will choose as many individuals from the top of the final classification as he considers appropriate. A graphical representation of the results can help in making a good decision. Such a diagram for the studied pod beans is shown in Fig. 5.

Fig. 5 The diagram showing the final ranking of the decision alternatives (score values, left axis, against percent of the maximum score, right axis; maximum score obtained: 3.571389, i.e. 91.12%)

2.4 Conclusion

As shown in this section, the selection phase of a plant-breeding program can be viewed as a Fuzzy Multi-Attribute Decision problem. A methodology to apply this model was proposed. Its main advantages consist in the possibility of stating the selection criteria through linguistic terms. Some selection criteria could be expressed by intervals instead of linguistic values. However, these intervals should also be transformed into fuzzy sets to ensure better utilization. The results of the statistical study performed on the sets of plants involved can be used to define the fuzzy variables, as such statistical studies are present in all plant breeding programs.

3 Similarity Measures in Fuzzy Database Querying

3.1 Why Employ Fuzzy Database Querying in Plant Breeding?

In many decision situations, large amounts of data must be inspected in order to extract the information that would suggest the course of action. A usual IT approach is to store the respective data in a relational database (RDB). Its exploitation consists of applying queries to select those records that match the requirements in a given context. The resulting set of records, the dynaset, plays the role of the decision alternatives that are to be ranked in order to determine the best solution.

However, one can point out some limits of the conventional RDB approach. In a conventional query, the implicit assumption is that the searched values are known exactly: the search parameters must be expressed in a crisp form, such as strings, values, or intervals. Moreover, at the simplest level, query criteria are connected through logical AND, meaning that a record must match all of them in order to be selected. At least two drawbacks arise from here. On the one hand, when referring to different attributes, the database user often deals with linguistic terms (like much, approximately, etc.) instead of crisp values. On the other hand, a crisp formulation of the search criteria might lead to omissions in the dynaset. For example, one applies a query stating: (attribute) A between 0.7 and 1.4. A record having A = 0.65 would be rejected, even though this value could be quite satisfactory considering other points of view that prevail. For example, the values of attributes B and C could make this record very suitable for the overall purpose.

In the decades that followed the inception of Fuzzy Theories due to Zadeh [23], important research efforts were made towards involving this theoretical support in modeling different facets of vagueness in database management. Advances were made in the general framework, named fuzzy database management [9], which covered various scopes: fuzzy querying of classic databases [10], storing and handling linguistic values, generalizing the RDB model [15], and indexing of fuzzy databases [2]. An important part of these efforts concentrated upon introducing and implementing query techniques for intelligent data retrieval (e.g. [10, 12]). As the literature shows, fuzzy querying is almost always linked to two other important topics: fuzzy decision and fuzzy similarity. These concepts appear together in many research projects engaged in intelligent data manipulation. Data retrieval is often involved in selecting decision alternatives, as shown in the preceding sections. Matching between search criteria and stored data, one or both being represented through fuzzy sets, means evaluating degrees of similarity. Moreover, fuzzy similarity and fuzzy decision often interact with each other [21].


In this context, the procedure introduced in the previous section can be regarded as a querying method using fuzzy criteria that applies to a database table containing punctual values. The next sections introduce two query methods that use linguistic criteria and apply to attributes stored as numeric intervals.

3.2 A Fuzzy Querying Method for Interval Valued Attributes

Many quantitative characters of different cultivars are officially recorded as intervals or linguistic values. Even those expressed as single numbers represent averages rather than exact values. For instance, the standard length of leaves is officially recorded for any variety of vine. Examples: Grenache Noir, short, that is, 11–13 cm; Chenin Blanc, medium, that is, 16–18 cm.

When designing a plant breeding program, the objectives are defined in a rather linguistic manner: plants having between 60 and 80 cm in height at a medium growth rate, producing at least 5 kg of fruits, etc. These goals are expressed according to different factors (costs, quality level, etc.) and rely on previous expertise, statistical observation, technical literature, and others. Again, linguistic terms can be expressed as fuzzy variables. Yet even exact intervals and single values are to be put as fuzzy numbers of trapezoidal and triangular shape. This is because there are always ranges of variable confidence on both sides of the declared range or number. In fact, levels of approximation are often present when expressing biological characteristics.

Such a situation could impose applying a linguistic query whose parameters areexpressed as fuzzy intervals to a database with crisp attribute values.

A crisp interval I can be viewed as a special case of fuzzy interval having KER(I) = SUPP(I), that is, a rectangular membership function. Therefore, the matching between a query criterion and an actual value in the database can be expressed as the similarity of two fuzzy sets [5].

There are many ways to define fuzzy similarity. One of them is the model used within FuzzyCLIPS¹, an extended version of the CLIPS² rule-based shell for expert systems. FuzzyCLIPS [25] uses two distinct inexact concepts: fuzziness and uncertainty. Fuzziness means that the facts can be not only exact statements, but linguistic assertions represented by fuzzy sets. Furthermore, every fact (fuzzy or non-fuzzy) has an associated Certainty Factor (CF) that expresses the degree of certainty about that piece of information. The value of CF lies between 0 and 1, where 1 indicates complete certainty that the fact is true. A FuzzyCLIPS simple rule has the form:

¹ FuzzyCLIPS was developed by the Knowledge Systems Laboratory, Institute for Information Technology, National Research Council of Canada.

² CLIPS was developed by the Artificial Intelligence Section, Lyndon B. Johnson Space Center, NASA.


if A then C    (CFr)
A′             (CFf)
------------------------------
C′             (CFc)

where

A    is the antecedent of the rule,
A′   is the matching fact in the fact database,
C    is the consequent of the rule,
C′   is the actual consequent,
CFr  is the certainty factor of the rule,
CFf  is the certainty factor of the fact,
CFc  is the certainty factor of the conclusion.

Three types of simple rules are defined in FuzzyCLIPS:

• CRISP_ rules, which have a crisp fact in the antecedent;
• FUZZY_CRISP rules, which have a fuzzy fact in the antecedent and a crisp one in the consequent;
• FUZZY_FUZZY rules, which have fuzzy facts both in the antecedent and in the consequent.

To perform a query with fuzzy criteria, the model of the FUZZY_CRISP rule was chosen. In a FUZZY_CRISP rule:

• A and A′ are fuzzy facts, not necessarily equal, but overlapping;
• C′ is equal to C;
• CFc is calculated using the following relation:

CFc = CFr · CFf · S.    (11)

In relation (11), S is the measure of similarity between the fuzzy sets FA (determined by the fuzzy fact A) and FA′ (determined by the matching fact A′). Similarity is based on the measure of possibility P and the measure of necessity N. These terms have the following definitions:

For the fuzzy sets FA : U → [0, 1] and FA′ : U → [0, 1],

S = P(FA|FA′) if N(FA|FA′) > 0.5,    (12)

S = (N(FA|FA′) + 0.5) · P(FA|FA′) otherwise,    (13)


where

P(FA|FA′) = max(min(FA(x), FA′(x))) ∀(x ∈ U),    (14)

N(FA|FA′) = 1 − P(F̄A|FA′).    (15)

F̄A is the complement of FA:

F̄A(x) = 1 − FA(x), ∀(x ∈ U).    (16)
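Relations (12) to (16) can be sketched on a discretized universe. The query criterion, the stored interval, and the grid below are invented for illustration; FA is a trapezoidal query term and FA2 stands for FA′, a rectangular fuzzy set representing a database interval:

```python
# FuzzyCLIPS-style similarity S between a trapezoidal query criterion FA
# and a rectangular database interval FA2, per relations (12)-(16).
import numpy as np

U = np.linspace(0, 30, 3001)   # discretized universe of discourse (step 0.01)

def trap(x, a, b, c, d):
    """Trapezoidal membership with support (a, d) and kernel [b, c]."""
    return np.clip(np.minimum((x - a) / (b - a), (d - x) / (d - c)), 0.0, 1.0)

FA  = trap(U, 12, 15, 20, 23)                 # query "A between 15 and 20", support (12, 23)
FA2 = ((U >= 16) & (U <= 22)).astype(float)   # stored interval [16, 22], rectangular

P = np.max(np.minimum(FA, FA2))               # possibility, relation (14)
N = 1.0 - np.max(np.minimum(1.0 - FA, FA2))   # necessity, relations (15)-(16)
S = P if N > 0.5 else (N + 0.5) * P           # relations (12)-(13)
print(round(float(P), 2), round(float(N), 2), round(float(S), 2))
```

Here the stored interval overlaps the kernel of the query, so P = 1, but its right end reaches into the falling edge, so N falls below 0.5 and S comes out around 0.83 by relation (13).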

A search criterion may be put as a FUZZY_CRISP simple rule. For example:

if (A between k1 and k2) then (A_OK) CFr 0.8 .

In this example

• the antecedent, A between k1 and k2, is represented by a fuzzy set having the kernel [k1, k2] and a user-defined support (s1, s2); it plays the role of A in the FuzzyCLIPS model (see, for example, Fig. 6);

• the consequent, A_OK, is a crisp fact, meaning that, from the user's point of view, the mentioned range of A meets the requirements;

• the value of the certainty factor of the rule means that, when necessary, a value smaller than 1 can be assigned to CFr to express some doubt about the validity of the rule. However, the default value should be CFr = 1.

The database containing the recorded specifications provides the corresponding intervals that play the role of A′, the fact involved in the matching process. Consequently, for the fact A′, a rectangular membership function should be used. This fact has CFf = 1. To illustrate, Fig. 6 shows the (database) interval [a1, a2] represented as a rectangular fuzzy set.

The result of this FUZZY_CRISP inference is the fact A_OK with a calculated CFc, meaning that, with respect to attribute A, the record meets the requirements to the extent expressed by CFc. The value of CFc is computed according to relation (11). As discussed, in this relation CFr = 1 and, except for some special situations, CFf = 1. Therefore, the most important factor in the calculation of CFc is S, which is in fact the only necessary factor in most situations. Hence, for the sake of simplicity, the implicit value CFr = 1 will be considered.

Fig. 6 Graphical illustration for Case 1 (the trapezoidal rule criterion with support (s1, s2) and kernel [k1, k2], and the rectangular database interval [a1, a2])

The overall conclusion is that, following the model of FUZZY_CRISP rules, the matching degree of a fuzzy interval query parameter to the corresponding interval stored in the database can be expressed by calculating the measure of similarity S between the respective trapezoidal and rectangular fuzzy sets. Moreover, as a result of the search process, this method provides computed degrees of similarity between each query parameter and the corresponding database attribute value. This additional information can be used in the subsequent evaluation and ranking of alternatives within a Decision Support System, as shown in a previous section.

The computing procedure described by relations (12) to (16) could seem quite laborious, but a closer look shows that the calculations need to be performed only in a reduced number of cases. This is shown in the following case discussion.

Case 1

• Rule: if (A between k1 and k2) then (A_OK) CFr = 1
• Fact in the database: a1 ≤ A ≤ a2, with [a1, a2] ⊆ [k1, k2]

The calculation of S does not actually need to be performed. This case corresponds to the situation of a classical query: the kernel of the fuzzy fact FA′ from the database is included in the kernel of the rule (query) FA, which means that the interval in the database entirely fulfils the conditions of the query. Thus, the value of S can be directly set to 1.

Case 2

• Rule: if (A between k1 and k2) then (A_OK) CFr = 1
• Fact in the database: a1 ≤ A ≤ a2, with k1 < a1 < k2 and k2 < a2 < (k2 + s2)/2

Taking into account the position of a2 and using relation (14), geometrical considerations lead to

P(F̄A|FA′) < 0.5

(F̄A denoting the complement of FA). Then, from (15),

N(FA|FA′) > 0.5,

which entails from (12)

S = P(FA|FA′).

From the position of a1 it results that

P(FA|FA′) = 1,


386 M. Calin, C. Leonte

that is

S = 1 .

Finally,

CFc = CFr · CFf · S = 1 .

The numerical example shown in Fig. 7 illustrates the considered situation, that is:

P(F̄A|FA′) = 0.33 ⇒ N(FA|FA′) = 0.66 > 0.5 ⇒ S = P(FA|FA′) = 1

In Case 2 the kernel of the fuzzy set FA′ is not included in the kernel of FA, but the “deviation” is very slight and it does not affect the eligibility of the record. This situation is expressed by the calculated value S = 1. One can see that in a “classic” crisp querying process, such a record would have been rejected because the actual interval in the database did not match the query criterion.

Case 3

• Rule: if (A between k1 and k2) then (A_OK) CFr = 1
• Fact in the database: a1 ≤ A ≤ a2, with k1 < a1 < k2 and (k2 + s2)/2 ≤ a2 < s2

Taking into account the position of a2 and using relation (14), geometrical considerations lead to

P(F̄A|FA′) > 0.5.

Then, from (15),

N(FA|FA′) < 0.5,

which entails from (13)

S = (N(FA|FA′) + 0.5) · P(FA|FA′).

From the position of a1 we have:

Fig. 7 Numerical example for Case 2


P(FA|FA′) = 1,

that is

S < 1 .

Finally,

CFc = CFr · CFf · S < 1 .

The numerical example shown in Fig. 8 illustrates the considered situation, that is:

P(F̄A|FA′) = 0.66 ⇒ N(FA|FA′) = 0.33 < 0.5 ⇒ S = (N(FA|FA′) + 0.5) · P(FA|FA′) = 0.83

Again, the kernel of the fuzzy set FA′ is not included in the kernel of FA, which means that in a crisp querying process the database record would have been rejected. Instead, the calculated value of S (in the numerical example, S = 0.83) can be interpreted as the degree in which the query criterion is satisfied by the actual interval in the database.

Case 4

• Rule: if (A between k1 and k2) then (A_OK) CFr = 1
• Fact in the database: a1 ≤ A ≤ a2, with s1 < a1 < k2 and a2 > s2

Taking into account the position of a2 (shown in Fig. 9) and using relation (14), geometrical considerations lead to

P(F̄A|FA′) = 1.

Then, from (15),

N(FA|FA′) = 0,

and from (13)

Fig. 8 Numerical example for Case 3


Fig. 9 Graphical illustration for Case 4 (rule fuzzy set with support [s1, s2] and kernel [k1, k2]; database interval [a1, a2])

S = (N(FA|FA′) + 0.5) · P(FA|FA′).

From the position of a1:

P(FA|FA′) = 1.

Making the respective replacements we have S = 0.5, and finally

CFc = CFr · CFf · S = 0.5.

Like Case 1, this is another situation that must be treated in the conventional manner. Otherwise, the calculated value of S could lead to a misleading conclusion, because one might consider that S = 0.5 represents an acceptable value. In fact, the database record must be rejected (as it would have been in a conventional query, too) without performing any computation for S.

The discussed cases show that the validity of a criterion can be judged according to the following three situations, which are mutually exclusive:

• KER(FA′) ⊆ KER(FA): the criterion matches in the crisp sense; set S to 1.

• KER(FA′) ⊆ SUPP(FA) and KER(FA′) ⊄ KER(FA): matching in the fuzzy sense; compute the value of S using relations (12) to (16).

• KER(FA′) ∩ COKER(FA) ≠ ∅: the criterion does not match; reject the record.

The preceding discussion has only analyzed those cases that occur at one end of the matching intervals, but the final conclusions are valid for any combined situations occurring at both ends.
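The three-way decision above lends itself to a compact sketch. The following fragment is illustrative only: relations (12) to (16) are not reproduced in this excerpt, so the possibility and necessity values are computed in the FuzzyCLIPS style [25] that the case discussion follows, and all function and variable names are invented.

```python
def similarity(rule, fact):
    """Matching degree S of a crisp database interval against a trapezoidal
    query fuzzy set FA, following the three mutually exclusive situations above.

    rule = (s1, k1, k2, s2): support [s1, s2] and kernel [k1, k2] of FA.
    fact = (a1, a2): the database interval, i.e. the kernel of the
    rectangular fuzzy set FA'.
    Returns S in [0, 1], or None when the record must be rejected.
    """
    s1, k1, k2, s2 = rule
    a1, a2 = fact

    def mu(x):  # trapezoidal membership function of FA
        if x < s1 or x > s2:
            return 0.0
        if x < k1:
            return (x - s1) / (k1 - s1)
        if x > k2:
            return (s2 - x) / (s2 - k2)
        return 1.0

    if k1 <= a1 and a2 <= k2:            # KER(FA') within KER(FA): crisp match
        return 1.0
    if not (s1 <= a1 and a2 <= s2):      # KER(FA') meets COKER(FA): reject
        return None

    # Matching in the fuzzy sense: possibility P(FA|FA') is the maximum of mu
    # over [a1, a2]; necessity N(FA|FA') = 1 - P(complement of FA | FA') is
    # the minimum of mu over [a1, a2], attained at an endpoint.
    candidates = [a1, a2] + [k for k in (k1, k2) if a1 <= k <= a2]
    possibility = max(mu(x) for x in candidates)
    necessity = min(mu(a1), mu(a2))
    if necessity > 0.5:
        return possibility
    return (necessity + 0.5) * possibility
```

With the Case 2 and Case 3 geometry this reproduces the behaviour discussed above: a slight overrun of the kernel still yields S = 1, a larger one yields S < 1, and an interval leaving the support is rejected without computing S.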


3.3 A Fuzzy Relation for Comparing Intervals

This section presents a method [3] to express the matching degree of two arbitrary intervals through a fuzzy relation having the properties of reflexivity and symmetry, that is, a fuzzy relation of tolerance. This method could be useful in situations in which one has to compare two entities whose attributes are (partially) expressed through intervals, and to give a measure of their matching degree. Such a real-world situation can occur in plant breeding or in other agricultural disciplines when two cultivars must be compared in order to express some sort of resemblance. For example, one wishes to search in a database for equivalents of some given cultivar (one that already exists or is designed to be obtained through breeding). Those characters expressed as intervals can be compared using the described procedure. Parts of the method can also be used in various other situations.

Numerous methods for comparing intervals are currently known. The following approach is based on geometrical remarks regarding the two partial aspects of the topic: the length comparison and the measure of concentricity. Each of them leads to a reflexive and symmetrical fuzzy relation. The overall measure of interval matching is obtained by combining these two partial measures. Each of them can also be used separately wherever necessary.

3.3.1 Matching

Let [a, b] and [c, d] be two intervals that must be compared in order to compute a measure of their matching. Let this matching degree be match([a, b]; [c, d]). The aim is to define match as a non-dimensional measure having the following properties:

Property 1.

0 ≤ match([a, b]; [c, d]) ≤ 1, (17)

where

match([a, b]; [c, d]) = 0 (18)

means no matching at all, and

match([a, b]; [c, d]) = 1 iff a = c and b = d (19)

means a perfect match.

Property 2. (reflexivity).

match([a, b]; [a, b]) = 1 (20)


Property 3. (symmetry).

match([a, b]; [c, d]) = match([c, d]; [a, b]) (21)

Definition 7. The matching degree is calculated as:

match([a, b]; [c, d]) = round([a, b]; [c, d]) · conc([a, b]; [c, d]) (22)

where

• round([a, b]; [c, d]) measures the degree in which the lengths of the two intervals match,

• conc([a, b]; [c, d]) expresses the degree of concentricity of the intervals.

Remark 2. The name round stands for roundness, and conc stands for concentricity.

The partial measures round and conc are defined further on.

3.3.2 Roundness

In relation (22), round denotes the roundness of a rectangle whose sides have the lengths b − a and d − c, respectively. The roundness measure of a shape has been used in the literature [8] within image processing applications. For two dimensions this measure is defined as follows.

Let R be a two-dimensional (2D) region. The roundness of region R can be computed using the following procedure:

• consider a reference circle C2D having

– the centre in the centre of gravity of region R;
– the radius r = √(area(R)/π) (23)

• compute

roundness2D(R) = max(1 − [area(R \ C2D) + area(C2D \ R)]/area(R), 0) (24)

• normalize to the interval [0, 1]

When region R is a rectangle, denoted ABCD, two special cases appear. These are shown in Fig. 10, where α1 = ∠EGP and α2 = ∠JGQ. There are two more cases that correspond to Fig. 10, in which AB > BC, but the lines of reasoning are the


Fig. 10 Computation of the roundness of rectangle ABCD

same. Figure 11 shows the special case in which the rectangle ABCD is a square; as can be expected, this is the case of maximum roundness.

Denoting roundness(ABCD) the roundness of the rectangle ABCD, and denoting

X = [area(ABCD \ C2D) + area(C2D \ ABCD)]/area(ABCD), (25)

relation (24) for roundness becomes

roundness(ABCD) = max(1 − X, 0). (26)

According to Fig. 10, two distinct cases appear.

Case a (left side of Fig. 10):

area(ABCD \ C2D) = 2 · area(EFP), (27)

area(C2D \ ABCD) = area(C2D) − area(ABCD) + 2 · area(EFP) = 2 · area(EFP) (since area(C2D) = area(ABCD)), (28)

X = 4 · area(EFP)/area(ABCD), (29)

area(EFP) = [area(ABCD)/π] · (α1 − sin α1 cos α1), (30)

X = (4/π) · (α1 − sin α1 cos α1). (31)

Fig. 11 The square is “the roundest rectangle”


Case b (right side of Fig. 10). In this case we obtain

X = (4/π) · (α1 − sin α1 cos α1 + α2 − sin α2 cos α2). (32)

Then, α1 and α2 must be computed.

cos α1 = GP/EG = GP/radius(C2D) = (AB/2) · √π/√(AB · BC) = (1/2) · √(πp), (33)

where

p = AB/BC. (34)

Thus,

α1 = arccos((1/2) · √(πp)), (35)

and, through similar calculations,

α2 = arccos((1/2) · √(π/p)). (36)

From Table 6, where a few computed values of roundness are listed, some conclusions can be drawn:

i) roundness(ABCD) becomes zero when the longer side becomes approximately five times longer than the shorter one;

ii) roundness(ABCD) reaches its maximum, roundness(ABCD) ≈ 0.82, for the square; the normalization in [0, 1] is shown in the last column of Table 6.

Another important conclusion comes from relations (31), (32), (35) and (36):

iii) roundness(ABCD) only depends on the ratio between the height and the width of the rectangle.

The component round([a, b]; [c, d]) in relation (22) will be expressed using the roundness of the rectangle that has its two dimensions equal to the lengths of the compared intervals, [a, b] and [c, d] respectively. Conclusion iii) drawn above is important because it shows the symmetry of round([a, b]; [c, d]). The other property of round([a, b]; [c, d]) must be reflexivity. It comes from ii) and the consequent normalization, which entails

round([a, b]; [c, d]) = 1 iff the two intervals have equal lengths; in particular, round([a, b]; [a, b]) = 1.

This is why a normalization in [0, 1] of roundness was necessary.


Table 6 Computed values of the roundness of a rectangle and their normalization in [0, 1]

p = AB/BC   1/p     α1 (rad.)   α2 (rad.)   round      normalization
0.2         5       1.16        0           0          0
0.21        4.75    1.15        0           0.006183   0.00755047
0.66        1.5     0.76        0           0.665968   0.81323948
0.78        1.27    0.66        0           0.769158   0.93924826
0.8         1.25    0.65        0.14        0.778422   0.9505613
1           1       0.48        0.48        0.818908   1
1.25        0.8     0.14        0.65        0.778422   0.9505613
1.27        0.78    0           0.66        0.769158   0.93924826
1.5         0.66    0           0.76        0.665968   0.81323948
4.75        0.21    0           1.15        0.006183   0.00755047
5           0.2     0           1.16        0          0
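Relations (26) to (36) can be checked numerically against Table 6. The sketch below is a minimal illustration; `rect_roundness` and `MAX_ROUNDNESS` are invented names, and the normalization constant is the square's roundness value 0.818908 taken from the table.

```python
import math

MAX_ROUNDNESS = 0.818908  # roundness of the square, used for the normalization

def rect_roundness(p, normalize=True):
    """Roundness of a rectangle with side ratio p = AB/BC, relations (26)-(36)."""
    def ang(c):
        # When the reference circle no longer crosses the corresponding pair
        # of sides, the arccos argument exceeds 1 and the angle is zero.
        return math.acos(c) if c < 1.0 else 0.0
    a1 = ang(0.5 * math.sqrt(math.pi * p))   # relation (35)
    a2 = ang(0.5 * math.sqrt(math.pi / p))   # relation (36)
    x = (4.0 / math.pi) * (a1 - math.sin(a1) * math.cos(a1)
                           + a2 - math.sin(a2) * math.cos(a2))   # relation (32)
    r = max(1.0 - x, 0.0)                    # relation (26)
    return r / MAX_ROUNDNESS if normalize else r
```

Evaluated at the ratios of Table 6 this reproduces the listed values, e.g. `rect_roundness(1.5, normalize=False)` gives approximately 0.665968, and the symmetry in p and 1/p is visible directly in relations (35) and (36).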

3.3.3 Concentricity

The other element in calculating the matching degree (22) is the concentricity of the two intervals one from another, conc([a, b]; [c, d]). The concentricity is determined by the centers of the intervals. In Fig. 12 some cases are shown. The intervals are placed on the real-number axis, in its positive region, but this does not restrict the generality. The centers of [a, b] and [c, d] have the co-ordinates m and m′, respectively.

To express the concentricity, the points shown in Fig. 13 were placed in a Cartesian system of co-ordinates. The points A, B and C, D are placed on Oy as if the intervals [a, b] and [c, d] were represented on the vertical axis. Again, the two intervals were assumed to be in the positive region of the axis, but this does not affect the generality.

Denoting lab = b − a the length of interval [a, b] and lcd = d − c the length of [c, d], for the points in Fig. 13, the following relations and co-ordinates are defined:

a) [a, b] ≠ [c, d], m ≠ m′
b) [a, b] ≠ [c, d], m = m′
c) [a, b] = [c, d], m = m′

Fig. 12 Different positions of the centers of two intervals


lab = b − a = yB − yA     lcd = d − c = yD − yC

A(0, yA)   B(0, yB)   A′((lab + lcd)/2, yA)   B′((lab + lcd)/2, yB)
C(0, yC)   D(0, yD)   C′((lab + lcd)/2, yC)   D′((lab + lcd)/2, yD)
O′(0, (yA + yB)/2)   M′((lab + lcd)/4, (yC + yD)/2)

Having defined the points shown in Fig. 13, the concentricity conc is defined as the cosine of δ = ∠MO′M′:

conc([a, b]; [c, d]) = cos δ. (37)

Elementary calculations lead to

cos δ = 1/√(1 + tan²δ), where tan δ = [2/(lab + lcd)] · [(yC + yD) − (yA + yB)] (38)

Concentricity conc([a, b]; [c, d]) = cos δ has the following properties:

i) 0 < conc([a, b]; [c, d]) ≤ 1
ii) conc([a, b]; [a, b]) = 1
iii) conc([a, b]; [c, d]) = conc([c, d]; [a, b])
iv) conc([a, b]; [c, d]) = conc([a, b]; [c − α, d + α])

Property i) shows the normalization of conc in [0, 1]. Property ii) shows the reflexivity of conc, which means that when the current interval is equal to the reference one, conc is equal to 1. Property iii) shows the symmetry of conc, that is, when computing the concentricity of two intervals, it doesn’t matter which one is the reference.
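Relations (37) and (38) can be sketched in a few lines, and properties i) to iii) checked numerically. The function name `conc` follows the text; the implementation details are an assumption.

```python
import math

def conc(ab, cd):
    """Concentricity of two intervals [a, b] and [c, d], relations (37)-(38)."""
    (a, b), (c, d) = ab, cd
    total = (b - a) + (d - c)                # lab + lcd
    # tan(delta), relation (38); (c + d) - (a + b) equals twice the distance
    # between the centers m and m' of the two intervals.
    t = 2.0 * ((c + d) - (a + b)) / total
    return 1.0 / math.sqrt(1.0 + t * t)      # cos(delta)
```

Because only t² enters the result, swapping the two intervals (which flips the sign of tan δ) leaves conc unchanged, which is exactly property iii).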

Fig. 13 Construction of the angle δ expressing the concentricity of two intervals


In addition, from ii) and iv) we may conclude that concentricity is equal to 1 for any two intervals that have the same centre (m = m′ in Fig. 12). This means that conc is not sufficient to express the whole matching degree between two intervals; it must be used together with round in the evaluation of the matching degree.
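Putting the two partial measures together, relation (22) can be sketched end to end. The helper below re-implements the roundness and concentricity formulas inline so the example is self-contained; the names and the clamping of small rounding errors in the normalization are illustrative choices, not part of the original method.

```python
import math

SQUARE_ROUNDNESS = 0.818908  # maximum roundness, reached by the square (Table 6)

def roundness_norm(p):
    # Normalized roundness of a rectangle with side ratio p, relations (26)-(36)
    def ang(c):
        return math.acos(c) if c < 1.0 else 0.0
    a1 = ang(0.5 * math.sqrt(math.pi * p))
    a2 = ang(0.5 * math.sqrt(math.pi / p))
    x = (4.0 / math.pi) * (a1 - math.sin(a1) * math.cos(a1)
                           + a2 - math.sin(a2) * math.cos(a2))
    return min(max(1.0 - x, 0.0) / SQUARE_ROUNDNESS, 1.0)

def match(ab, cd):
    """Matching degree of two intervals, relation (22): round times conc."""
    (a, b), (c, d) = ab, cd
    rnd = roundness_norm((b - a) / (d - c))              # length comparison
    t = 2.0 * ((c + d) - (a + b)) / ((b - a) + (d - c))  # tan(delta), (38)
    return rnd / math.sqrt(1.0 + t * t)                  # round * cos(delta)
```

As expected from Properties 1 to 3, the sketch returns 1 only for identical intervals, is symmetric in its arguments, and decreases as the intervals drift apart.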

3.4 Conclusions

The utilization of the fuzzy technique described in Sect. 3.2 allows linguistic formulations of the query criteria for interval-valued attributes. Compared to the conventional methods, it reduces the risk of rejecting records that can be valuable for the final purpose. Degrees of acceptability are also computed for the respective attributes in each selected record.

In Sect. 3.3, a reflexive and symmetrical fuzzy relation was defined for comparing two intervals. It can also be used as a query technique on a database having interval-valued attributes. Moreover, a matching degree between the two intervals is also computed.

The computed values of similarity that respectively result from applying the methods described in Sects. 3.2 and 3.3 can be used to automate further decision-making within a Multi-Attribute Decision Making procedure, as discussed in Sect. 2.

References

1. Bellman RA, Zadeh LA (1970) Decision-making in a fuzzy environment. Management Sciences, Ser. B 17, 141–164

2. Boss B, Helmer S (1996) Indexing a Fuzzy Database Using the Technique of Superimposed Coding - Cost Models and Measurements (Technical Report of the University of Mannheim). www.informatik.uni-mannheim.de/techberichte/html/TR-96-002.html

3. Calin M, Galea D (2001) A Fuzzy Relation for Comparing Intervals. In: Reusch B (ed.) Computational Intelligence. Theory and Applications: International Conference, 7th Fuzzy Days, Dortmund, Germany, October 1–3, 2001, Proceedings. Lecture Notes in Computer Science, Vol. 2206, p. 904. Springer-Verlag, Heidelberg

4. Calin M, Leonte C (2001) Applying the Fuzzy Multi-Attribute Decision Model in the Selection Phase of Plant Breeding Programs. In: Farkas I (ed.) Artificial Intelligence in Agriculture 2001. Proc. of the 4th IFAC Workshop, Budapest, Hungary, 6–8 June 2001, pp. 93–98. Elsevier Science

5. Calin M, Leonte C, Galea D (2003) Fuzzy Querying Method with Application in Plant Breeding. Proc. of the International Conference “Fuzzy Information Processing - FIP 2003”, Beijing, China. Springer and Tsinghua University Press

6. Coghlan A (2006) Genetically modified crops: a decade of disagreement. New Scientist, 2535, 21 Jan 2006

7. Fullér R (2003) Soft Decision Analysis. Lecture notes. http://www.abo.fi/~rfuller/lsda.html

8. Hiltner J (1996) Image Processing with Fuzzy Logic. Summer School “Intelligent Technologies and Soft Computing”, Black Sea University, Romania, Sept. 22–28, 1996

9. Kacprzyk J (1997) Fuzziness in Database Management Systems. Tutorial at the Conference “5th Fuzzy Days”, Dortmund, Germany

10. Kacprzyk J (1995) Fuzzy Logic in DBMSs and Querying. 2nd New Zealand Two-Stream International Conference on Artificial Neural Networks and Expert Systems (ANNES ’95), Dunedin, New Zealand

11. Lammerts van Bueren E, Osman A (2001) Stimulating GMO-free breeding for organic agriculture: a view from Europe. LEISA Magazine, December 2001

12. Lelescu A. Processing Imprecise Queries in Database Systems. http://www.cs.uic.edu/~alelescu/eecs560_project.html

13. Leonte C (1996) Horticultural Plant Breeding (published in Romanian). EDP, Bucharest, Romania

14. Leonte C, Târdea G, Calin M (1997) On the correlation between different quantitative characters of Cape pod bean variety (published in Romanian). Lucrari stiintifice, Horticultura, USAMV Iasi, Romania, Vol. 40, 136–142

15. Medina JM, Pons O, Vila MA (1994) GEFRED. A Generalized Model of Fuzzy Relational Databases. Technical Report #DECSAI-94107, Approximate Reasoning and Artificial Intelligence Group (ARAI), Granada University, Spain. http://decsai.ugr.es/difuso/fuzzy.html

16. Negoita CV, Ralescu DA (1974) Fuzzy Sets and Applications. Editura Tehnica, Bucharest (published in Romanian)

17. Reusch B (1996) Mathematics of Fuzzy Logic. In: Real World Applications of Intelligent Technologies (Zimmermann H-J, Negoita M, Dascalu D, eds.). Publishing House of the Romanian Academy, Bucharest, Romania

18. Saaty TL (1980) The Analytic Hierarchy Process: Planning, Priority Setting, Resource Allocation. McGraw-Hill, New York

19. Serag-Eldin G, Nikravesh M (2003) Web-Based BISC Decision Support System. 2003 BISC FLINT-CIBI International Joint Workshop on Soft Computing for Internet and Bioinformatics. UC Berkeley, 15–19 December 2003

20. Sperling L, Ashby JA, Smith ME, Weltzien E, McGuire S (2001) A Framework for Analysing Participatory Plant Breeding Approaches and Results. Participatory Plant Breeding, a Special Issue of EUPHYTICA, Vol. 122(3)

21. Williams J, Steele N (2002) Difference, Distance and Similarity as a basis for fuzzy decision support based on prototypical decision classes. Fuzzy Sets and Systems 131, 35–46

22. Yager RR (1988) Ordered weighted averaging aggregation operators in multi-criteria decision making. IEEE Trans. on Systems, Man and Cybernetics, 18, 183–190

23. Zadeh LA (1965) Fuzzy Sets. Information and Control, 8: 338–353

24. Zimmermann H-J (1996) Fuzzy Decision Support Systems. In: Real World Applications of Intelligent Technologies (Zimmermann H-J, Negoita M, Dascalu D, eds.). Publishing House of the Romanian Academy, Bucharest, Romania

25. *** (1994) FuzzyCLIPS Version 6.02A. User’s Guide. Knowledge Systems Laboratory, Institute for Information Technology, National Research Council, Canada


Soft Computing in the Chemical Industry: Current State of the Art and Future Trends

Arthur Kordon

Abstract The chapter summarizes the current state of the art of applying soft computing solutions in the chemical industry, based on the experience of The Dow Chemical Company, and projects the future trends in the field, based on the expected future industrial needs. Several examples of successful industrial applications of different soft computing techniques are given: automated operating discipline, based on fuzzy logic; empirical emulators of fundamental models, based on neural networks; accelerated new product development, based on genetic programming; and advanced process optimization, based on swarm intelligence. The impact of these applications is a significant improvement in process operation and much faster modeling of new products.

The expected future needs of industry are defined as: predictive marketing, accelerated new product diffusion, high-throughput modeling, manufacturing at economic optimum, predictive optimal supply chains, intelligent security, reduced virtual bureaucracy, emerging simplicity, and handling the curse of decentralization. The future trends that are expected from the soft computing technologies which may satisfy these needs are as follows: perception-based modeling, integrated systems, a universal role for intelligent agents, models/rules with evolved structure, and swarm intelligence-based process control.

1 Introduction

One of the current trends in introducing emerging technologies in industry is the growing acceptance of different approaches of computational intelligence in general and of soft computing in particular. This trend is represented in several recently published books. An overview of different industrial implementations is given in (Jain and Martin 1999); numerous soft computing applications in the area of intelligent control are shown in (Zilouchian and Jamshidi 2001). A recent summary of various

Arthur Kordon
Engineering & Process Sciences, Core R&D
The Dow Chemical Company, 2301 N Brazosport Blvd, Freeport, TX 77541, USA
e-mail: [email protected]

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


soft computing applications in fish processing, facilities layout planning, mobile position estimation in cellular systems, etc. is given in (Karray and De Silva 2004).

Of special importance for the future development of soft computing is its successful introduction into big global corporations in key industries. The focus of this chapter is to show exactly this case and to present a systematic view of the current state of the art of soft computing applications, the open issues that need to be overcome, and the future trends in this area, based on the experience of a big global corporation such as The Dow Chemical Company. Although limited to the chemical industry, the presented view is typical for many other industries based on manufacturing and new product development.

The proposed future trends in soft computing industrial applications are based on the accumulated experience from past successes and failures and on predicted future industrial needs. One can define this approach as a “supply-demand” method of prediction, where the future demands of industry are expected to be satisfied by the supply of research approaches and ideas. From the predicted future industrial needs, the attention is focused on those that could be resolved by soft computing or computational intelligence technologies. The hypothesis is that these needs will navigate the direction of future research in soft computing and inspire new ideas and solutions. It is assumed, though, that independently of the “supply side”, research will also be driven by its internal scientific mechanisms, such as combining evolutionary development and revolutionary paradigm shifts (Barker 1989). However, predicting the future trends on the “supply side” is not a subject of this chapter. This topic is covered in different papers from academic authors. An example of defining the future directions of Genetic Fuzzy Systems is given in (Herrera 2005).

In predicting the future trends it is important to emphasize the critical role of industry in the short history of soft computing. For example, the successful industrial applications of fuzzy logic in Japan in the mid-70s gave decisive support to the development of this approach at a moment when many researchers had rejected the idea or had serious doubts about its scientific value (Karray and De Silva 2005). In similar ways, the fast acceptance in industry and the demonstrated value from many successful applications contributed to the accelerated growth in the research areas of neural networks and evolutionary computation. It is expected that industry will play this role in the future, and this assumption is at the basis of the “supply-demand” method of prediction.

However, in order to make the predictions more accurate, some specific non-technical issues have to be taken into account. Industrial R&D has faster dynamics and is subject to more frequent structural changes than the academic environment. As a result, it is more sensitive to the speed of acceptance of a new technology by several different stakeholders, such as researchers, process engineers, managers, etc. That significantly increases the importance of proper solutions to various organizational and political issues for promoting the technology from research to the business domain. Without addressing these issues there is always a potential for reducing the support or even questioning the rationale of using soft computing.

The chapter is organized in the following manner. The starting point of our analysis is based on the current state of the art of soft computing in the chemical


industry, illustrated with typical examples in Sect. 2. The open issues in soft computing industrial applications, which will affect the future trends, are discussed in Sect. 3, followed by defining the projected industrial needs in Sect. 4. The future trends in soft computing industrial applications, driven by expected industrial needs, are described in Sect. 5.

2 Current State of the Art of Soft Computing in the Chemical Industry

The chemical industry has some specific issues, such as:

• high dimensionality of plant data (thousands of variables and control loops)
• increased requirements for model robustness toward process changes due to the cyclical nature of the industry
• multiple optima
• key process knowledge resides with process operators and is poorly documented
• high uncertainty in new material market response, etc.

One of the observed current tendencies in the chemical industry is saturation with a variety of modeling solutions in manufacturing. As a result of the intensive modeling efforts in the last 20 years, many manufacturing processes in the most profitable plants are already supplied with different types of models (steady-state, dynamic, model predictive control, etc.). This creates a culture of modeling “fatigue” and resistance to the introduction of new solutions, especially ones based on unknown technologies. This makes the efforts of applying soft computing systems in industry especially challenging, since demonstration of significant competitive advantages relative to the alternative modeling and hardware solutions is required.

One area where soft computing systems have a clear competitive advantage is in the development of simple empirical solutions in terms of models and rules. In several industrial applications it has been shown that the models generated by soft computing are a low-cost alternative to both high-fidelity models (Kordon et al. 2003a) and expensive hardware analyzers (Kordon et al. 2003b).

Different soft computing technologies have been explored by several researchers in The Dow Chemical Company since the late 80s – early 90s (Gerules et al. 1992; Kalos 1992; Kalos and Rey 1995). However, the real breakthrough in applying these technologies was achieved after consolidating the available resources in a specialized research group in 2000. This resulted in various successful applications in the areas of inferential sensors, automated operating discipline, accelerated new product development, and advanced process optimization. The key accomplishments and application areas are summarized in (Kotanchek et al. 2002; Kordon et al. 2004; Kordon et al. 2005). Examples of successful industrial applications of different soft computing technologies are given below:


2.1 Fuzzy Logic Application

Operating discipline is a key factor for competitive manufacturing. Its main goal is to provide a consistent process for handling all possible situations in the plant. It is the biggest knowledge repository for plant operation. However, the documentation for operating discipline is static and is detached from the real-time data of the process. The missing link between the dynamic nature of process operation and the static nature of operating discipline documents is traditionally carried out by the operations personnel. However, this makes the existing operating discipline process very sensitive to human errors, competence, inattention, or lack of time to consult the documentation.

One approach to solving the problems associated with operating discipline and making it adaptive to the changing operating environment is to use real-time hybrid intelligent systems. Such a system was successfully implemented in a large-scale chemical plant at The Dow Chemical Company (Kordon et al. 2001). It is based on integrating experts’ knowledge with soft sensors and fuzzy logic. The hybrid system runs in parallel with the process; it detects and recognizes problem situations automatically in real time; it provides a user-friendly interface so that operators can readily handle complex alarm situations; it suggests the proper corrective actions via a hyper-book; and it facilitates effective shift-to-shift communication.

The processed data are then supplied to feature detection, by testing against a threshold or a range. Every threshold contains a membership function as defined by the fuzzy logic approach. The values for the parameters of the membership function are based either on experts’ assessment or on statistical analysis of the data.
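Such a fuzzified threshold can be sketched as a simple ramp membership function; the function name and the numeric band below are invented for illustration, and real applications would use shapes fitted from expert assessment or statistics.

```python
def fuzzy_threshold(x, low, high):
    """Membership degree of the feature 'x exceeds the threshold', with the
    crisp cut-off replaced by an uncertainty band [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)  # linear ramp inside the band
```

A reading well inside the band then fires the associated detection rule only partially, instead of toggling an alarm on a single crisp limit.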

2.2 Neural Networks Application

The execution speed of the majority of the complex first-principle models is too slow for real-time operation. One effective solution is to emulate a portion of the fundamental model by a neural network or symbolic regression model, called an emulator, built only with selected variables related to process optimization. The data for the emulator are generated by design of experiments from the first-principle model. In this specific case, neural networks are an appropriate solution because they will operate within the training range. Usually the fundamental model is represented by several simple emulators, which are implemented on-line. One interesting benefit of emulators is that they can be used as fundamental model validation indicators as well. Complex model validation during continuous process changes requires tremendous effort in data collection and the fitting of numerous model parameters. It is much easier to validate the simple emulators and to infer the state of the complex model on the basis of the high correlation between them. An example of emulator application for intermediate product optimization between two chemical plants in The Dow Chemical Company is given in (Kordon et al. 2003a).


2.3 Evolutionary Computation Application

The key creative process in fundamental model building is hypothesis search. Unfortunately, the effectiveness of hypothesis search depends very strongly on the creativity, experience, and imagination of the model developers. The broader the assumption space (i.e., the higher the complexity and dimensionality of the problem), the larger the differences in modelers’ performance and the higher the probability of ineffective fundamental model building.

In order to improve the efficiency of hypothesis search and to make the fundamental model discovery process more consistent, a new "accelerated" fundamental model building sequence was proposed. The key idea is to reduce the fundamental hypothesis search space by using symbolic regression generated by Genetic Programming (GP) (Koza 1992). The main steps of the proposed methodology are shown in Fig. 1.

The key difference from the classical modeling sequence is in running simulated evolution before beginning the fundamental model building. As a result of GP-generated symbolic regression, the modeler can identify key variables and assess the physical meaning of their presence/absence. Another significant side effect of the simulated evolution is the analysis of the key transforms with high fitness that persist during the GP run. Very often some of the transforms have a direct physical interpretation that can lead to better process understanding at the very early phases of fundamental model development. The key result from the GP run, however, is the list of potential nonlinear empirical models in the form of symbolic regression. The expert may select and interpret several empirical solutions or repeat the GP-generated symbolic regression until an acceptable model is found.
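The transform analysis can be illustrated without a full GP run: the sketch below simply ranks a fixed set of candidate input transforms by absolute correlation with the response on synthetic data, a deliberately crude stand-in for inspecting the high-fitness transforms of an actual GP population:

```python
import math, random

random.seed(0)

# Synthetic process data: y depends on x1 through a square-root law; x2 is noise.
n = 200
x1 = [random.uniform(1.0, 10.0) for _ in range(n)]
x2 = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [3.0 * math.sqrt(a) + random.gauss(0, 0.05) for a in x1]

def corr(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Candidate transforms, standing in for the building blocks a GP run explores.
transforms = {
    "x1": x1,
    "x1^2": [a * a for a in x1],
    "sqrt(x1)": [math.sqrt(a) for a in x1],
    "log(x1)": [math.log(a) for a in x1],
    "x2": x2,
}
ranked = sorted(transforms, key=lambda k: -abs(corr(transforms[k], y)))
print(ranked[0])   # sqrt(x1) -- the transform with direct physical meaning
```

The top-ranked transform recovers the square-root law hidden in the data, which is exactly the kind of early physical insight the simulated-evolution step is meant to deliver, while the irrelevant input x2 falls to the bottom of the list.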

The fundamental model building step 5 is based either on direct use of the empirical models or on independently derived first-principle models induced by the results from the symbolic regression. In both cases, the effectiveness of the whole modeling sequence could be significantly improved.

The large potential of genetic programming (GP)-based symbolic regression for accelerated fundamental model building was demonstrated in a case study of structure–property relationships (Kordon et al. 2002). The generated symbolic solution was similar to the fundamental model and was delivered with significantly less human effort (10 hours vs. 3 months). By optimizing the capabilities for obtaining fast and reliable GP-generated functional solutions in combination with the fundamental modeling process, a real breakthrough in the speed of new product development can be achieved.

2.4 Swarm Intelligence Application

Process optimization is an area where soft computing technologies can make an almost immediate economic impact and demonstrate value. Since the early 1990s, various evolutionary computation methods, mostly genetic algorithms, have been successfully applied in industry, including The Dow Chemical Company. Recently, a new approach, Particle Swarm Optimization (PSO), has been found to be very attractive for industrial applications. The main attractiveness of PSO is that it is fast, it can handle complex high-dimensional problems, it needs a small population size, and it is simple to implement and use. Different types of PSO have been explored in The Dow Chemical Company. A hybrid PSO and Levenberg–Marquardt method was used for quick screening of complicated kinetic models (Katare et al. 2004). The PSO successfully identified the promising regions of parameter space, which are then optimized locally. A different, multi-objective PSO was investigated in (Rosca et al. 2005) and applied for real-time optimization of a color spectrum of plastics based on 15 parameters.
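A minimal global-best PSO fits in a few lines; the swarm constants and the convex test objective below are illustrative textbook choices, not details of the Dow implementations:

```python
import random

random.seed(1)

def sphere(p):
    """Simple convex test objective with its minimum at the origin."""
    return sum(x * x for x in p)

def pso(f, dim=2, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    # Random initial positions; zero initial velocities.
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

best, best_val = pso(sphere)
print(best_val)   # small residual: the swarm converges near the optimum at (0, 0)
```

The small population and the absence of gradients are what make the method attractive for the kinetic-model screening use case: the objective can be an arbitrary black-box simulation.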

[Fig. 1 flowchart: 1. Problem definition → 2. Run symbolic regression (GP) → 3. Identify key factors & transforms → 4. Select GP-generated models → 5. Construct first-principle models → 6. Select & verify the final model solution → 7. Validate the model]

Fig. 1 Accelerated new product development by using Genetic Programming

3 Open Issues in Soft Computing Industrial Applications

The ultimate success of an industrial application requires support from and interaction with multiple people (operators, engineers, managers, researchers, etc.). Because it may interfere with the interests of some of these participants, human factors, even "politics", can play a significant role (Kordon 2005). On one hand, there are many proponents of the benefits of soft computing who are ready to cooperate and take the risk to implement this new technology. Their firm commitment and enthusiasm are a decisive factor in driving the implementation to final success. On the other hand, however, there are also many skeptics who do not believe in the real value of soft computing and look at the implementation efforts as a "research toy exercise". In order to address these issues, a more systematic approach to organizing the development and application efforts is recommended. The key organizational and political issues are discussed below.

3.1 Organizational Issues

1) Critical mass of developers: It is very important at this early phase of industrial applications of soft computing to consolidate the development efforts. The probability of success based only on individual attempts is very low. The best-case scenario is to create a virtual group that includes not only specialists directly involved in soft computing development and implementation, but also specialists with complementary areas of expertise, such as machine learning, expert systems, and statistics.

2) Soft computing marketing to business and research communities: Since soft computing is virtually unknown not only to business-related users but to most of the other research communities as well, it is necessary to promote the approach with significant marketing efforts. Usually this research-type marketing includes a series of promotional meetings based on two different presentations. The one directed toward the research communities focuses on the "technology kitchen", i.e., it gives enough technical detail to describe the soft computing technologies, demonstrates the differences from other known methods, and clearly illustrates their competitive advantages. The presentation for the business-related audience focuses on the "technology dishes", i.e., it demonstrates with specific industrial examples the types of applications that are appropriate for soft computing, describes the work process to develop, deploy, and support a soft computing application, and illustrates the potential financial benefits.

3) Link soft computing to proper corporate initiatives: The best-case scenario we recommend is to integrate the development and implementation efforts within the infrastructure of a proper corporate initiative. A typical example is the Six Sigma initiative, which is practically a global industrial standard (Breyfogle III 2003). In this case the organizational efforts will be minimized, since the companies have already invested in the Six Sigma infrastructure and the implementation process is standard and well known.

3.2 Political Issues

1) Management support: Consistent management support for at least several years is critical for introducing any emerging technology, including soft computing. The best way to win this support is to define the expected research efforts and to assess the potential benefits from specific application areas. Of decisive importance, however, is the demonstration of value creation by resolving practical problems as soon as possible.

2) Skepticism and resistance toward soft computing technologies: There are two key sources of this attitude. The first source is the potential user in the businesses, who is economically pushed more and more toward cheap, reliable, and easy-to-maintain solutions. In principle, soft computing applications require some training and significant cultural change. Many users are reluctant to take the risk even if they see a clear technical advantage. A persistent dialog, supported with economic arguments and examples of successful industrial applications, is needed. Sometimes sharing the risk by having the research organization absorb the development cost is a good strategy, especially for applications that can be easily leveraged.

The second source of skepticism and resistance is the research community itself. For example, the majority of model developers prefer the first-principle approaches and very often treat the data-driven methods as "black magic" that cannot replace solid science. In addition, many statisticians express serious doubts about some of the technical claims of different soft computing technologies. The only winning strategy to change this attitude is more intensive dialog and finding areas of common interest. An example of a fruitful collaboration between fundamental model developers and soft computing developers is given in the previous section and described in (Kordon et al. 2002). Recently, a joint effort between statisticians and soft computing developers demonstrated significant improvement in using genetic programming in industrial statistical model building (Castillo et al. 2004).

3) Lack of initial credibility: As a relatively new approach in industry, soft computing does not have a credible application history for convincing a potential industrial user. Almost any soft computing application requires a high-risk culture and significant communication efforts. The successful examples discussed in this chapter are a good start to gain credibility and increase the soft computing potential customer base in the chemical industry.

4 Projected Industrial Needs Related to Soft Computing Technologies

In defining the expected industrial needs in the next 10–15 years, one has to take into account the increasing pressure for more effective and fast innovation strategies (George et al. 2005). One of the key consequences is the expected reduced cost and time of exploring new technologies. As a result, priority will be given to those methods that can deliver practical solutions with minimal investigation and development efforts. This requirement pushes toward better understanding not only of the unique technical capabilities of a specific approach, but also toward evaluating the total cost of ownership (potential internal research, software development and maintenance, training, implementation efforts, etc.). Only approaches with a clear competitive advantage will be used for solving the problems driven by the new needs in industry.

The projected industrial needs are based on the key trends of increased globalization, outsourcing, progressively growing computational power, and expanding wireless communication. The selected needs are summarized in the mind-map in Fig. 2 and are discussed below:

4.1 Predictive Marketing

Successful acceptance of any new product or service by the market is critical for industry. Until now, most of the modeling efforts have been focused on product discovery and manufacturing. Usually the product discovery process is based on new compositions or technical features. Often, however, products with expected attractive new features are not accepted by the potential customers. As a result, big losses are recorded, and the rationale of the product discovery and the credibility of the related technical modeling efforts are questioned.

One possible solution to this generic issue in industry is improving the accuracy of market predictions with better modeling. Recently, there have been many activities in using the enormous amount of data on the Internet for customer characterization (Schwartz 2000) and modeling by intelligent agents (Said and Bouron 2001). Soft computing plays a significant role in this growing new type of modeling. The key breakthrough, however, can be achieved by modeling customers' perceptions. The subject of marketing is the customer. In predicting her response to a new product, the perception of the product is at the center of the decision-making process. Unfortunately, perception modeling is still an open area of research. However, this gap may create a good opportunity to be filled by the new approach in soft computing, perception-based computing (Zadeh 1999, 2005). With the growing speed of globalization, of special importance is predicting the perception of a new product in different cultures. If successful, the impact of predictive marketing on any type of industry will be enormous.

Fig. 2 A mind-map of the future industrial needs

4.2 Accelerated New Products Diffusion

The next big gap to be filled with improved modeling is optimal new product diffusion, assuming a favorable market acceptance. The existing modeling efforts in this area have varied from analytical methods (Mahajan et al. 2000) to agent-based simulations (Bonabeau 2002). The objective is to define an optimal sequence of actions to promote a new product through research and development, manufacturing, the supply chain, and different markets. There are different solutions for some of these sectors (Carlsson et al. 2002). What is missing, however, is an integrated approach across all sectors.
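One classical analytical diffusion model of the kind surveyed in Mahajan et al. (2000) is the Bass model, in which adopters arrive through innovation (coefficient p) and imitation (coefficient q); the coefficient values below are illustrative, not estimates for any real product:

```python
def bass_adoption(p=0.03, q=0.38, market=1.0, periods=30):
    """Cumulative adoption fraction per period under a discrete Bass model.

    New adopters each period: (p + q * cum / market) * (market - cum),
    i.e. innovators plus imitators acting on the remaining potential market.
    """
    cum, path = 0.0, []
    for _ in range(periods):
        new = (p + q * cum / market) * (market - cum)
        cum += new
        path.append(cum)
    return path

path = bass_adoption()
# Adoption follows the familiar S-curve, approaching full market penetration.
```

Fitting p and q to early sales data, and coupling the fitted curve to manufacturing and supply-chain decisions, is the kind of cross-sector integration the text identifies as missing.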

4.3 High-Throughput Modeling

Recently, high-throughput combinatorial chemistry has become one of the leading approaches for generating innovative products in the chemical and pharmaceutical industries (Cawse 2003). It is based on intensive design of experiments through a combinatorial sequence of potential chemical compositions and catalysts on small-scale reactors. The idea is to find new materials by fast data analysis. The bottleneck, however, is the speed and quality of model generation, which is much slower than the speed of experimental data generation. Soft computing could increase the efficiency of high-throughput research by adding modeling methods that can generate empirical dependencies and capture the knowledge accumulated during the experimental work.

4.4 Manufacturing at Economic Optimum

Most existing manufacturing processes operate under model predictive control systems or well-tuned PID controllers. It is assumed that the setpoints of the controllers are calculated based on either optimal or sub-optimal conditions. However, in the majority of cases the objective functions include technical criteria and are not directly related to economic profit. Even when the economics is explicitly used in optimization, the resulting optimal control does not respond adequately to fast changes and swings in raw material prices. The high dynamics and sensitivity to local events of the global economy require process control methods that continuously follow the moving economic optimum. It is also desirable to include a predictive component based on economic forecasts.

4.5 Predictive Optimal Supply-Chain

As a consequence of the increased outsourcing of manufacturing and the growing tendency of purchasing over the Internet, the share of supply-chain operations in total cost has significantly increased (Shapiro 2001). One of the challenges for future supply-chain systems is the exploding amount of data they need to handle in real time because of the application of radio frequency identification (RFID) technology. It is expected that this new technology will require on-line fast pattern recognition, trend detection, and rule definition on very large data sets. Of special importance are the extensions of supply-chain models to incorporate demand management decisions as well as corporate financial decisions and constraints. It is obviously suboptimal to evaluate strategic and tactical supply-chain decisions without considering marketing decisions that will shift future sales to those products and geographical areas where profit margins will be highest. Without including future demand forecasts, the supply-chain models lack predictive capability. Current efforts in the supply chain are mostly focused on improved data processing and analysis and on exploring different optimization methods (Shapiro 2001). What is missing in the supply-chain modeling process is including the growing experience of all participants in the process and refining the decisions with the methods of soft computing.

4.6 Intelligent Security

It is expected that, with the technological advancements in distributed computing and communication, the demand for protecting the security of information will grow. At some point the security challenges may even prevent the mass-scale application of some technologies in industry because of the high risk of intellectual property theft, manufacturing disturbances, and even process incidents. One technology in this category is distributed wireless control systems, which may include many communicating smart sensors and controllers. Without secure protection of the communication, however, it is doubtful that this technology will be accepted in manufacturing even if it has clear technical advantages.

Soft computing can play a significant role in developing sophisticated systems with built-in intelligent security features. Of special importance are the co-evolutionary approaches based on intelligent agents (Skolicki et al. 2005).

4.7 Reduced Virtual Bureaucracy

Contrary to the initial expectations of improved efficiency through global virtual offices, an exponentially growing role of electronic communication not related to creative work activities is observed. Virtual bureaucracy enhances the classical bureaucratic pressure with new features: bombardment with management emails from all levels in the hierarchy; transferring the whole communication flow to any employee in the organization; obsession with numerous virtual training programs, electronic feedback forms, and exhaustive surveys; continuous tracking of bureaucratic tasks and pushing the individual to complete them at any cost; etc. As a result, the efficiency of creative work is significantly reduced. Soft computing cannot eliminate the root cause of this phenomenon, which lies in business culture, management policies, and human nature. However, by modeling management decision making and analyzing efficiency in business communication, it is possible to identify criteria for the bureaucratic content of messages. These could be used to protect the individual from virtual bureaucracy in a similar way to how spam filters operate.

4.8 Emerging Simplicity

An analysis of the survivability of different approaches and modeling techniques in industry shows that the simpler the solution, the longer it is used and the lower the need to replace it with anything else. A typical case is the longevity of PID controllers, which are still the backbone of manufacturing control systems. According to a recent survey, PID is used in more than 90% of practical control systems, ranging from consumer electronics such as cameras to industrial processes such as chemical processes (Li et al. 2006). One of the reasons is their simple structure, which appeals to the generic knowledge of process engineers and operators. The defined tuning rules are also simple and easy to explain.
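The simplicity argument is easy to see in code: a complete discrete PID loop regulating a first-order process fits in a dozen lines. The gains and the process time constant below are illustrative, not tuned by any formal rule:

```python
def simulate_pid(kp=2.0, ki=1.0, kd=0.1, setpoint=1.0, dt=0.05, steps=400):
    """Discrete PID loop driving a first-order process toward the setpoint."""
    y, integral, prev_err = 0.0, 0.0, setpoint
    tau = 0.5                              # process time constant (illustrative)
    for _ in range(steps):
        err = setpoint - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * deriv   # the entire control law
        prev_err = err
        # First-order process model: dy/dt = (-y + u) / tau, Euler-integrated.
        y += dt * (-y + u) / tau
    return y

print(round(simulate_pid(), 3))   # settles close to the setpoint of 1.0
```

Three gains, one line of control law: this transparency, and the fact that each term has an intuitive meaning to operators, is what the text identifies as the source of PID's longevity.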

Soft computing technologies can deliver simple solutions when multi-objective optimization with complexity as an explicit criterion is used. An example of this approach with many successful applications in the chemical industry is symbolic regression generated by Pareto-front genetic programming (Smits and Kotanchek 2004). Most of the implemented empirical models are very simple (see examples in (Kordon et al. 2006)) and were easily accepted by process engineers. Their maintenance efforts were low and their performance over changing process conditions was acceptable.

4.9 Handling the Curse of Decentralization

The expected growth of wireless communication technology will allow the mass-scale introduction of "smart" components into many industrial entities, such as sensors, controllers, packages, parts, products, etc., at relatively low cost. This tendency will provide a technical opportunity for the design of totally decentralized systems with built-in self-organization capabilities. On the one hand, this could lead to the design of entirely new flexible industrial intelligent systems capable of continuous structural adaptation and fast response to a changing business environment. On the other hand, however, the design principles and reliable operation of a totally distributed system of thousands of communicating entities of diverse nature are still a technical dream. Most of the existing industrial systems avoid the curse of decentralization through their hierarchical organization. However, this imposes significant structural constraints and assumes that the designed hierarchical structure is at least rational if not optimal. The problem is the static nature of the designed structure, which is in striking contrast to the expected increased dynamics in industry in the future. Once set, the hierarchical structure of industrial systems is changed either very slowly or not at all, since significant capital investment is needed. As a result, the system does not operate effectively or optimally in dynamic conditions.

The appearance of totally decentralized intelligent systems of freely communicating sensors, controllers, manufacturing units, equipment, etc., can allow the emergence of optimal solutions in real time and effective response to changes in the business environment. Soft computing technologies could be the basis for the design of such systems, and intelligent agents could be the key carrier of the intelligent component of each distributed entity.

5 Future Trends of Soft Computing Driven by Expected Industrial Needs

An analysis of the expected industrial needs shows that there will be a shift of demand from manufacturing-type process-related models to business-related models for marketing, product diffusion, supply chain, security, etc. As was discussed in Sect. 2, manufacturing is saturated with different models, and most of the existing plants operate close to the optimum. The biggest impact could be in using soft computing for modeling the human-related activities in the business process. By their nature these are based on fuzzy information, perceptions, and imprecise communication, and soft computing techniques are the appropriate methods to address these modeling challenges.

The selected future trends in soft computing industrial applications are shown in Fig. 3 and discussed below.

Fig. 3 A mind-map of the future trends in soft computing industrial applications

5.1 Perception-Based Modeling

The recent developments of Prof. Zadeh in the areas of computing with words, precisiated natural language, and the computational theory of perceptions (Zadeh 1999, 2005) are the theoretical foundation of perception-based modeling. Predictive marketing could be a key application area of this approach. However, the theoretical development has to be gradually complemented with available software tools for real-world applications. A representative test case for a proof-of-concept application is also needed.

5.2 Integrate & Conquer

The current trend of integrating different soft computing approaches will continue in the future. Most existing industrial applications are not based on a single soft computing technique and require several methods for effective modeling. An example of the successful integration of neural networks, support vector machines, and genetic programming is given in (Kordon 2004). The three techniques complement each other to generate the final simple solution. Neural networks are used for nonlinear variable selection and reduce the number of variables. Support vector machines condense the data points to those with significant information content, i.e., the support vectors. Only the reduced, information-rich data set is used for the computationally intensive genetic programming. As a result, the model development time and cost are significantly reduced. More intensive integration efforts, including the new developments in soft computing, are needed in the future. This trend is a must for satisfying any of the needs discussed in the previous section.
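The three-stage reduction can be sketched schematically. The stand-ins below are deliberately crude (correlation ranking instead of neural network sensitivity analysis, greedy farthest-point sampling instead of support-vector extraction), but they show how each stage shrinks the problem handed to the expensive GP step:

```python
import math, random

random.seed(2)

# Synthetic data set: 6 candidate inputs, but only x0 and x1 actually matter.
n = 300
X = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(n)]
y = [2 * r[0] - 3 * r[1] + random.gauss(0, 0.01) for r in X]

def corr(u, v):
    """Pearson correlation coefficient."""
    mu, mv = sum(u) / n, sum(v) / n
    c = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return c / math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

# Stage 1: variable selection (stand-in for NN-based sensitivity ranking).
scores = [abs(corr([r[j] for r in X], y)) for j in range(6)]
keep = sorted(range(6), key=lambda j: -scores[j])[:2]

# Stage 2: data condensation (stand-in for SVM support-vector selection):
# keep a small, information-rich subset by greedy farthest-point sampling.
Xr = [[r[j] for j in keep] for r in X]
subset = [0]
while len(subset) < 30:
    def dist_to_subset(i):
        return min(sum((a - b) ** 2 for a, b in zip(Xr[i], Xr[s])) for s in subset)
    subset.append(max(range(n), key=dist_to_subset))

# Stage 3: the expensive model (GP in the chapter) now runs on 30 points
# and 2 variables instead of 300 points and 6 variables.
print(len(keep), len(subset))   # -> 2 30
```

However crude the stand-ins, the pipeline shape is the point: each stage discards what the next, more expensive stage does not need, which is what makes the combined approach economical.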

5.3 Universal Role of Intelligent Agents

The growing role of distributed systems in the future requires a software entity that represents a system, is mobile, and has the intelligent capabilities to interact, pursue goals, negotiate, and learn. Intelligent agents can successfully play this role, and soft computing has a significant impact on this technology (Loia and Sessa 2001). Of special importance is the use of intelligent agents as a natural integrator of the different soft computing techniques. It is possible that intelligent agents will become the universal carrier of computational intelligence. The implementation of soft computing systems based on intelligent agents can satisfy most of the future industrial needs, with a critical role in predictive supply chains, intelligent security, and solving the curse of decentralization.

5.4 Solutions with Evolved Structure

One of the key differences between the existing industrial applications and the future needs is the transition from models and rules with a fixed structure to solutions with an evolved structure. Recently, evolving Takagi–Sugeno fuzzy systems have demonstrated impressive performance on several benchmarks and applications (Angelov and Filev 2004). They are based on incremental recursive clustering for on-line training of fuzzy partitions, with the possibility of adjoining new or removing old fuzzy sets or rules whenever there is a change in the input-output data pattern. If successful, this approach can significantly reduce the maintenance cost of many applied models in manufacturing.

5.5 Swarm Intelligence-Based Control

One potential approach that could address the need for continuously tracking the economic optimum, and could revolutionize process control, is swarm-based control (Conradie et al. 2002). It combines the design and controller development functions into a single coherent step through the use of evolutionary reinforcement learning. In several case studies, described in (Conradie et al. 2002) and (Conradie and Aldrich 2006), the designed swarm of neuro-controllers finds and continuously tracks the economic optimum while avoiding the unstable regions. The profit gain in the case of a bioreactor control is > 30% relative to classical optimal control (Conradie and Aldrich 2006).

6 Conclusion

The chapter summarizes the current state of the art of applying soft computing solutions in the chemical industry, based on the experience of The Dow Chemical Company, and projects the future trends in the field, based on the expected future industrial needs. Several examples of successful industrial applications of different soft computing techniques are given: automated operating discipline, based on fuzzy logic; empirical emulators of fundamental models, based on neural networks; accelerated new product development, based on genetic programming; and advanced process optimization, based on swarm intelligence. The impact of these applications is a significant improvement in process operation and much faster modeling of new products.

The expected future needs of industry are defined as: predictive marketing, accelerated new product diffusion, high-throughput modeling, manufacturing at the economic optimum, predictive optimal supply chains, intelligent security, reduced virtual bureaucracy, emerging simplicity, and handling the curse of decentralization. The future trends expected from the soft computing technologies that may satisfy these needs are as follows: perception-based modeling, integrated systems, a universal role for intelligent agents, models/rules with evolved structure, and swarm intelligence-based process control.

Acknowledgments The author would like to acknowledge the contributions to the discussed industrial applications of the following researchers from The Dow Chemical Company: Flor Castillo, Elsa Jordaan, Guido Smits, Alex Kalos, Leo Chiang, and Kip Mercure, and Mark Kotanchek from Evolved Analytics.

References

Angelov P, Filev D (2004) Flexible models with evolving structure. International Journal of Intelligent Systems 19:327–340
Barker J (1989) Discovering the future: the business of paradigms. ILI Press, St. Paul
Bonabeau E (2002) Agent-based modeling: methods and techniques for simulating human systems. PNAS 99(suppl 3):7280–7287
Breyfogle III F (2003) Implementing Six Sigma, 2nd edn. Wiley, Hoboken
Carlsson B, Jacobsson S, Holmen M, Rickne A (2002) Innovation systems: analytical and methodological issues. Research Policy 31:233–245
Castillo F, Kordon A, Sweeney J, Zirk W (2004) Using genetic programming in industrial statistical model building. In: O'Reilly UM, Yu T, Riolo R, Worzel B (eds) Genetic programming theory and practice II. Springer, New York, pp 31–48
Cawse J (ed) (2003) Experimental design for combinatorial and high-throughput materials development. Wiley, Hoboken
Conradie A, Miikkulainen R, Aldrich C (2002) Adaptive control utilising neural swarming. In: Proceedings of GECCO'2002, New York, pp 60–67
Conradie A, Aldrich C (2006) Development of neurocontrollers with evolutionary reinforcement learning. Computers and Chemical Engineering 30:1–17
George M, Works J, Watson-Hemphill K (2005) Fast innovation. McGraw Hill, New York
Gerules M, Kalos A, Katti S (1992) Artificial neural network application development: a knowledge engineering approach. In: 132A, AIChE 1992 annual fall meeting, Miami
Herrera F (2005) Genetic fuzzy systems: status, critical considerations and future directions. International Journal of Computational Intelligence 1:59–67
Jain J, Martin N (eds) (1999) Fusion of neural networks, fuzzy sets, and genetic algorithms: industrial applications. CRC Press, Boca Raton
Kalos A (1992) Knowledge engineering methodology for industrial applications. In: EXPERSYS-92, ITT-International
Kalos A, Rey T (1995) Application of knowledge based systems for experimental design selection. In: EXPERSYS-95, ITT-International
Karray G, De Silva C (2004) Soft computing and intelligent systems design: theory, tools and applications. Addison Wesley, Harlow
Katare S, Kalos A, West D (2004) A hybrid swarm optimizer for efficient parameter estimation. In: Proceedings of CEC'2004, Portland, pp 309–315
Kordon A (2004) Hybrid intelligent systems for industrial data analysis. International Journal of Intelligent Systems 19:367–383
Kordon A (2005) Application issues of industrial soft computing systems. In: Proceedings of NAFIPS'2005, Ann Arbor
Kordon A, Kalos A, Smits G (2001) Real time hybrid intelligent systems for automating operating discipline in manufacturing. In: Artificial Intelligence in Manufacturing Workshop, Proceedings of the 17th International Joint Conference on Artificial Intelligence IJCAI-2001, pp 81–87
Kordon A, Pham H, Bosnyak C, Kotanchek M, Smits G (2002) Accelerating industrial fundamental model building with symbolic regression: a case study with structure–property relationships. In: Proceedings of GECCO'2002, New York, volume Evolutionary Computation in Industry, pp 111–116
Kordon A, Kalos A, Adams B (2003a) Empirical emulators for process monitoring and optimization. In: Proceedings of the IEEE 11th Conference on Control and Automation MED'2003, Rhodes, p 111
Kordon A, Smits G, Kalos A, Jordaan E (2003b) Robust soft sensor development using genetic programming. In: Leardi R (ed) Nature-inspired methods in chemometrics. Elsevier, Amsterdam
Kordon A, Castillo F, Smits G, Kotanchek M (2006) Application issues of genetic programming in industry. In: Yu T, Riolo R, Worzel B (eds) Genetic programming theory and practice III. Springer, New York, pp 241–258
Kotanchek M, Kordon A, Smits G, Castillo F, Pell R, Seasholtz MB, Chiang L, Margl P, Mercure PK, Kalos A (2002) Evolutionary computing in Dow Chemical. In: Proceedings of GECCO'2002, New York, volume Evolutionary Computation in Industry, pp 101–110
Koza J (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge
Li Y, Ang K, Chong G (2006) Patents, software, and hardware for PID control. IEEE Control Systems Magazine 26:42–54
Loia V, Sessa S (eds) (2001) Soft computing agents. Physica-Verlag, Heidelberg
Mahajan V, Muller E, Wind Y (eds) (2000) New product diffusion models. Kluwer, Boston
Rosca N, Smits G, Wessels J (2005) Identifying a Pareto front with particle swarms. Unpublished
Said B, Bouron T (2001) Multi-agent simulation of a virtual consumer population in a competitive market. In: Proceedings of the 7th Scandinavian Conference on Artificial Intelligence, pp 31–43
Schwartz D (2000) Concurrent marketing analysis: a multi-agent model for product, price, place, and promotion. Marketing Intelligence & Planning 18:24–29
Shapiro J (2001) Modeling the supply chain. Duxbury, Pacific Grove
Skolicki Z, Houck M, Arciszewski T (2005) Improving the security of water distribution systems using a co-evolutionary approach. In: Critical Infrastructure Protection Session, "Working Together: Research & Development (R&D) Partnerships in Homeland Security", Department of Homeland Security Conference, Boston
Smits G, Kotanchek M (2004) Pareto front exploitation in symbolic regression. In: O'Reilly UM, Yu T, Riolo R, Worzel B (eds) Genetic programming theory and practice II. Springer, New York, pp 283–300

Zadeh L (1999) From computing with numbers to computing with words—from manipulation ofmeasurements to manipulation of perceptions”, IEEE Transactions on Circuits and Systems45:105–119

Zadeh L (2005) Toward a generalized theory of uncertainty (gtu) – an outline. Information Sciences172:1–40

Zilouchian A, Jamshidi M (eds),(2001) Intelligent control systems using soft computing method-ologies. CRC Press, Boca Raton

Page 417: Forging New Frontiers: Fuzzy Pioneers I

Identifying Aggregation Weights of Decision Criteria: Application of Fuzzy Systems to Wood Product Manufacturing

Ozge Uncu, Eman Elghoneimy, William A. Gruver, Dilip B. Kotak and Martin Fleetwood

1 Introduction

A rough mill converts lumber into components with prescribed dimensions. The manufacturing process begins with lumber being transferred from the warehouse to the rough mill, where its length and width are determined by a scanner. Then, each board is cut longitudinally by the rip saw to produce strips of prescribed widths. Next, the strips are conveyed to another scanner where defects are detected. Finally, the chop saw cuts the strips into pieces with lengths based on the defect information and component requirements. These pieces are conveyed to pneumatic kickers which sort them into bins. The rough mill layout and process of a major Canadian window manufacturer is used throughout this study.

The rough mill layout and process, as described by Kotak, et al. [11], is shown in Fig. 1. The operator receives an order in which due dates and required quantities of components with specific dimensions and qualities are listed. Since there is a limited number of sorting bins (which is much less than the number of components in the order), the operator must select a subset of the order, called the cut list, and assign it to the kickers. After selecting the cut list, the loads of lumber (called jags) that will be used to produce the components in the cut list must be selected by the operator. The operator also determines the arbor configuration and priority of the rip saw. Methods for kicker assignment have been reported by Siu, et al. [17], and jag selection has been investigated by Wang, et al. [22], [23]. A discrete event simulation model was developed by Kotak, et al. [11] to simulate the daily operation of a rough mill for specified jags, cut lists, ripsaw configuration, and orders.

Wang, et al. [22] proposed a two-step method to select the most suitable jag for a given cut list. The first step selects the best jag type based on the proximity of the length distribution of the cut list and the historical length distribution of the

Ozge Uncu · Eman Elghoneimy · William A. Gruver
School of Engineering Science, Simon Fraser University, 8888 University Dr., Burnaby, BC, Canada, e-mail: {ouncu, eelghone, gruver}@sfu.ca

Dilip B. Kotak · Martin Fleetwood
Institute for Fuel Cell Innovation, National Research Council, Vancouver, BC, Canada, e-mail: {dilip.kotak, martin.fleetwood}@nrc-cnrc.gc.ca

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007


Fig. 1 Rough mill layout and process

components cut by using the corresponding jag type. The second step selects the most suitable jag in the inventory with the selected jag type. The following two decision criteria are used to rank the jags: the width code of the jag and the board footage of the jag. In this model, the priorities of these two criteria are supplied a priori by the user and maintained the same, even under different cut list characteristics. Wang, et al. [23] extended this method to cover a number of decision criteria while selecting the suitable jag type. In their method, the priorities of the decision criteria must be identified in order to select the most suitable jag type for a given cut list. This was achieved by using a Mamdani Type 1 fuzzy rulebase, Fuzzy AHP [4] and Fuzzy TOPSIS [5]. The fuzzy rulebase has cut list characteristics as antecedents and pair-wise comparisons of the importance of decision criteria as consequents.

In the study presented in this paper, the method of Wang, et al. [23] is simplified so that conventional AHP and TOPSIS methods are utilized, thereby avoiding the computationally expensive fuzzy arithmetic and the irregularities associated with algebraic manipulation of triangular fuzzy numbers. The simplified model retains the fuzzy rulebase in order to capture the uncertainty due to different expert opinions on the importance of decision criteria for jag type selection. Then, the simplified model is further modified and extended to include decision criteria for determining the most suitable jag of the selected jag type. The modified and extended model does not require expert input to obtain pair-wise comparisons of the importance of decision criteria. Our approach thus eliminates the need for expert input, which may lead to unsatisfactory results as discussed in Sect. 4. A fuzzy rulebase is used to determine the weights of the decision criteria in selecting jag types and jags under different cut list characteristics. This rulebase is tuned by using genetic algorithms (GA) [8]. Since the order to be fulfilled is constant, the objective function to be minimized is the cost associated with the selected raw material and the processing time to complete the order.

The remainder of the paper is structured as follows. Background on multicriteria decision making methods and the jag selection approach used in the proposed


methods will be provided in Sect. 2. Then, the benchmark method and two methods which find the most suitable weights to lower the overall cost of the jag sequence will be explained in Sect. 3. The proposed approaches have been evaluated on four test order files for which results are provided in Sect. 4. Conclusions based on the results and potential extensions are discussed in Sect. 5.

2 Background

Since the proposed methods use the jag selection approach of Wang, et al. [22], the method will be summarized. In addition, we provide a review of MCDM methods and their associated disadvantages.

2.1 Multi-Criteria Decision Making Methods

Multi-Criteria Decision Making (MCDM) involves the evaluation of alternatives on the basis of a number of decision criteria. When the alternatives and decision criteria are given, there are two questions to be answered: How are the weights of the decision criteria determined, and how are the alternatives associated with the scores ranked? The earliest MCDM methods, which treated the second question, are the weighted sum model (WSM) [18] and the weighted product model (WPM) [18]. WSM and WPM provide a basis to calculate scores of the alternatives in order to rank them. The following two equations are used to choose the best alternative with respect to the scores calculated by WSM and WPM, respectively:

$$A_{k^*} = \left\{ A_k \,\middle|\, \sum_{j=1}^{m} a_{kj} w_j = \max_{i=1,\dots,n} \left( \sum_{j=1}^{m} a_{ij} w_j \right) \right\} \qquad (1)$$

$$A_{k^*} = \left\{ A_k \,\middle|\, \prod_{j=1}^{m} (a_{kj})^{w_j} = \max_{i=1,\dots,n} \left( \prod_{j=1}^{m} (a_{ij})^{w_j} \right) \right\} \qquad (2)$$

where w_j is the weight of the jth criterion, a_ij is the value of the ith alternative in terms of the jth criterion, A_k* is the best alternative, and A_k is the kth alternative. A major limitation of WSM is that it requires the criteria to have the same scale so that the addition of scores is meaningful. WPM uses ratios to avoid scale issues. Since the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) [9] and the Analytic Hierarchy Process (AHP) [15] will be used in Sect. 3, these methods will now be explained.
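The WSM and WPM selection rules in (1) and (2) can be sketched in a few lines; the list-of-lists layout and the function names below are illustrative, not part of the original method.

```python
# Sketch of the WSM and WPM selection rules in Eqs. (1)-(2).
def wsm_best(a, w):
    """a[i][j]: value of alternative i on criterion j; w[j]: criterion weight.
    Returns the index of the alternative maximizing the weighted sum."""
    scores = [sum(aij * wj for aij, wj in zip(row, w)) for row in a]
    return max(range(len(a)), key=lambda i: scores[i])

def wpm_best(a, w):
    """Returns the index of the alternative maximizing the weighted product."""
    def prod(row):
        p = 1.0
        for aij, wj in zip(row, w):
            p *= aij ** wj
        return p
    scores = [prod(row) for row in a]
    return max(range(len(a)), key=lambda i: scores[i])
```

Note that WSM and WPM can disagree: with `a = [[2, 3], [4, 1], [1, 5]]` and equal weights, the weighted sum favors the second alternative while the weighted product favors the first.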

For a given set of weights and alternatives, TOPSIS involves choosing the alternative that has the shortest distance to the positive ideal solution and the longest distance to the negative ideal solution. The set of criteria is divided into two subsets,


one to be minimized and the other to be maximized. Let the sets of criteria to be minimized and maximized be denoted as S_min and S_max, respectively, and let d(x, y) denote the Euclidean distance between vectors x and y. Then, the positive ideal solution s+ = (s_1^+, ..., s_m^+) and the negative ideal solution s− = (s_1^−, ..., s_m^−) are defined as follows:

$$s_j^{+} = \begin{cases} \max_{i=1,\dots,n} (a_{ij} w_j) & \text{if } j \in S_{\max} \\ \min_{i=1,\dots,n} (a_{ij} w_j) & \text{else,} \end{cases} \qquad s_j^{-} = \begin{cases} \min_{i=1,\dots,n} (a_{ij} w_j) & \text{if } j \in S_{\max} \\ \max_{i=1,\dots,n} (a_{ij} w_j) & \text{else} \end{cases} \qquad (3)$$

Let the score vector of the ith alternative be denoted by SCV_i = (sc_i1, ..., sc_im), where sc_ij = a_ij w_j is the score of the ith alternative in terms of the jth criterion. Then, the best alternative is selected as follows:

$$A_{k^*} = \left\{ A_k \,\middle|\, \frac{d(SCV_k, s^-)}{d(SCV_k, s^-) + d(SCV_k, s^+)} = \max_{i=1,\dots,n} \left( \frac{d(SCV_i, s^-)}{d(SCV_i, s^-) + d(SCV_i, s^+)} \right) \right\} \qquad (4)$$
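The TOPSIS selection in (3) and (4) can be sketched as follows, assuming weighted scores, Euclidean distance, and alternatives that are not all identical; the data layout is illustrative.

```python
import math

# Minimal sketch of TOPSIS ranking per Eqs. (3)-(4): weighted scores and
# Euclidean distance to the positive/negative ideal solutions.
def topsis_best(a, w, maximize):
    """a[i][j]: alternative values; w[j]: weights; maximize[j]: True if
    criterion j belongs to Smax (to be maximized)."""
    n, m = len(a), len(w)
    sc = [[a[i][j] * w[j] for j in range(m)] for i in range(n)]
    col = lambda j: [sc[i][j] for i in range(n)]
    s_pos = [max(col(j)) if maximize[j] else min(col(j)) for j in range(m)]
    s_neg = [min(col(j)) if maximize[j] else max(col(j)) for j in range(m)]
    d = lambda x, y: math.sqrt(sum((xj - yj) ** 2 for xj, yj in zip(x, y)))
    closeness = [d(sc[i], s_neg) / (d(sc[i], s_neg) + d(sc[i], s_pos))
                 for i in range(n)]
    return max(range(n), key=lambda i: closeness[i])
```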

The Analytic Hierarchy Process (AHP) [15] decomposes the solution of an MCDM problem into the following three steps. The first step models the problem in a hierarchical structure showing the goal, objectives (criteria), sub-objectives, and alternatives. The second step identifies the weights of each decision criterion. The domain experts compare the importance of decision criteria in pairs to create a pair-wise comparison matrix. In most of the methods derived from AHP, the criteria are assumed to be in ratio scale. The pair-wise comparison matrix has the following form:

$$PCM = \begin{bmatrix} c_{11} & c_{12} & \dots & c_{1m} \\ c_{21} & \ddots & & c_{ij} \\ \vdots & c_{ji} & \ddots & \vdots \\ c_{m1} & c_{m2} & \dots & c_{mm} \end{bmatrix}, \quad \text{where } c_{ij} = 1/c_{ji} \text{ and } c_{ii} = 1 \qquad (5)$$

In the original AHP method, the weight vector was calculated by determining the eigenvector associated with the largest eigenvalue of PCM. If the PCM is a perfectly consistent comparison, then the rank of PCM is equal to 1 and the only nonzero eigenvalue of PCM is equal to m, the number of decision criteria. Thus, in order to validate the results, a consistency measure is defined with respect to the difference between the largest eigenvalue of PCM and m. The consistency index is defined by Saaty [15] as follows:

$$CI = \frac{\lambda_{\max} - m}{m - 1} \qquad (6)$$

Then, a random index RI is calculated as the average of the consistency measures for many thousands of randomly generated PCM matrices of the form explained in (5). The ratio of CI to RI, called the consistency ratio, is used as the consistency measure. A rule of thumb is to consider the weight vector to be reliable if the


consistency ratio is less than 0.1. Then, the alternatives can be scored and ranked, and the best alternative can be selected by using a ranking method such as WSM, WPM, or TOPSIS. The major criticisms of the original AHP method are as follows: i) the number of pair-wise comparisons grows quadratically with the number of criteria, ii) it does not deal with interdependencies between decision criteria, iii) it is sensitive to the pair-wise comparisons and measurement scales, iv) whenever another alternative is added, the ranks of the alternatives can be totally changed (rank reversal), and, most importantly, v) experts are asked to provide crisp numbers while building the PCM matrix. Several studies have been published to address these issues. Triantaphyllou [19] proposed a method in which the relative performance of two decision criteria is examined within a given alternative instead of comparing two alternatives at a time. To address the rank reversal issue, Belton and Gear [1] proposed to divide each relative value by the maximum value in the corresponding vector of relative values instead of having the relative values sum to one. Sensitivity to the selection of the measurement scale was studied by Ishizaka, et al. [10].
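The eigenvector-based weight extraction and the consistency index in (6) can be sketched with plain power iteration; this is an illustrative implementation, not Saaty's original procedure.

```python
# Illustrative AHP weight extraction via power iteration on the PCM, plus
# Saaty's consistency index CI = (lambda_max - m) / (m - 1) from Eq. (6).
def ahp_weights(pcm, iters=200):
    m = len(pcm)
    v = [1.0 / m] * m                  # start from uniform weights
    lam = float(m)
    for _ in range(iters):
        u = [sum(pcm[i][j] * v[j] for j in range(m)) for i in range(m)]
        lam = sum(u)                   # lambda_max estimate (v sums to 1)
        v = [ui / lam for ui in u]     # renormalize so weights sum to 1
    ci = (lam - m) / (m - 1)           # consistency index, Eq. (6)
    return v, ci
```

For a perfectly consistent PCM (c_ij = w_i/w_j), the iteration recovers the weights exactly and CI is zero, matching the rank-1 argument above.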

The last issue has been treated by several researchers using fuzzy set theory. Van Laarhoven and Pedrycz [21] used triangular membership functions and the least squares method to extend the AHP method to the fuzzy domain. Buckley [3] criticized the use of triangular membership functions and used trapezoidal membership functions to modify the fuzzy AHP method of Van Laarhoven and Pedrycz. These methods still produce unreliable results due to rank reversals and require a considerable amount of computation. Deng [7] proposed a method that used α-cuts and fuzzy extent analysis to reduce the computations in the pair-wise comparison matrix represented by triangular membership functions. Mikhailov [13] cited the criticism of Saaty [16] on the use of extent analysis and further investigated the disadvantages of utilizing a comparison matrix constructed from fuzzy judgments represented by triangular membership functions and their reciprocals. He proposed to use fuzzy linear and nonlinear programming with the reciprocals of the triangular membership functions to determine crisp weights. Fuzzy extensions of WSM, WPM, and TOPSIS have also been studied [18].

2.2 Jag Selection

Jag selection requires finding the most suitable jag (among available jags) with proper capabilities to produce a given cut list [22]. Wang, et al. [22] utilized a method of case-based reasoning, called CBR-JagSelection, where jag types are defined by mill, grade, production line, length code, and thickness. A distribution of the percentage of cut quantities in each predetermined sort length bin is calculated for each jag type based on the average of historical cut quantities achieved by using jags that belong to the corresponding jag type. The CBR-JagSelection algorithm can be summarized as follows: For a given cut list, the distribution of the percentage of required quantities for the components in the cut list is created. The difference between the quantity-length distribution of the cut list and the historical distribution of each jag type is calculated by using the differences between percentages in the


two distributions and the values associated with each length bin. This difference, which can be considered as a penalty measure, is used to sort the jag types. By this means, the penalty measure is a measure of the suitability of the jag type for the given cut list.

Starting from the most suitable jag type, the available jags in the inventory are selected for each jag type until a maximum number of jags has been reached. The list of jags for each jag type is sorted by using criteria such as the proximity of the board footages (a measure of volume used in the lumber industry) of the jag and the cut list, and the suitability of the width of the selected jag for the given ripsaw arbor configuration and priority. For each alternative, individual scores corresponding to each decision criterion are calculated. Then the weighted average of the scores is used to rank the jags. The weights are specified by the user at the beginning of the run and are not changed until the order is finished. The sorted list of jags is presented to the user to select the jag [22]. The weights associated with the proximity of board footages and the suitability of the width of the selected jag will be denoted by wBDT and wWDT, respectively. Users of the jag selection method have reported that it is difficult to keep wBDT and wWDT fixed during the production run because shop floor managers want the weights to change dynamically depending on the characteristics of the cut lists. This issue will be addressed in Sect. 3.4.

The algorithm of Wang, et al. [22] generates a sorted list of alternative jags for a given cut list. Suppose we would like to generate a list of suitable jags for a cut list with homogeneous width and constant thickness. Let the cut list consist of the set of pairs (l_i, q_i), where l_i and q_i are the length and the required quantity of the ith component, respectively. Furthermore, let NCO be the number of components in the cut list and suppose that the components in the cut list are arranged in descending order with respect to their length. Then, the cumulative quantity percentage distribution of the cut list can be computed as follows:

$$CPD_m = \frac{\sum_{i=1}^{m} q_i}{\sum_{i=1}^{NCO} q_i}, \quad \forall m = 1, \dots, NCO \qquad (7)$$
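Eq. (7) can be sketched as follows; the (length, quantity) pair layout is an illustrative assumption.

```python
# Sketch of the cumulative quantity percentage distribution in Eq. (7).
def cutlist_cpd(components):
    """components: list of (length_mm, quantity) pairs, in any order.
    Returns CPD_m for m = 1..NCO with components in descending length order."""
    comps = sorted(components, key=lambda c: c[0], reverse=True)
    total = sum(q for _, q in comps)
    cpd, running = [], 0
    for _, q in comps:
        running += q
        cpd.append(running / total)
    return cpd
```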

Each jag type in the history file has a similar distribution that will be used to measure the degree of match between the jag type and the cut list. The distribution defined on the length axis is discretized into 75 bins in the interval [50 mm, 3750 mm] with 50 mm steps. The degree of match between the cut list and the jag type distributions is calculated. Then, the jag types are sorted according to their degree of match. Next, the jags available in the inventory are found for each jag type and sorted by priority according to their board footage and width.

The procedure for determining the sorted list of jags is as follows. Let NJT be the number of jag types, NCO be the number of components in the cut list, the cumulative percentage in the jth bucket of the distribution of the kth jag type JT_k be CPD_JT_k,j, and the value of the ith component with length l_i be V(l_i).


Step 0. Provide w_BDT and w_WDT.
Step 1. For all jag types, ∀k = 1, ..., NJT, DO
  Step 1.1 Initialize the extra E and the cumulative penalty P_k with 0.
  Step 1.2 Initialize flagFeasible with 1.
  Step 1.3 For all components in the cut list, ∀i = 1, ..., NCO, DO
    Step 1.3.1 Find the bucket j in the jag type distribution in which l_i resides.
    Step 1.3.2 Compute D = CPD_JT_k,j + E − CPD_j.
    Step 1.3.3 IF D < 0 THEN set flagFeasible to 0 and E to 0, ELSE E = D.
    Step 1.3.4 Update the penalty by P_k = P_k + |D| × V(l_i).
  Step 1.4 IF flagFeasible equals 0 THEN the jag type is categorized as infeasible, ELSE it is categorized as feasible.
Step 2. Sort the feasible jag types in ascending order with respect to penalties.
Step 3. Sort the infeasible jag types in ascending order with respect to penalties.
Step 4. Merge the two sorted feasible and infeasible jag type lists and select the best MAXJAGTYPE ones.
Step 5. For each of the remaining jag types in the sorted jag type list DO
  Step 5.1 Get the available jags from the inventory.
  Step 5.2 Sort the available jags in descending order with respect to their priority, calculated by using the jag width, the match between the board footage of the jag and the cut list, and the weights (w_BDT and w_WDT) specified by the user at the beginning of the run.

Algorithm 1. Jag Ranking (Jag List Creation)
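The feasibility and penalty loop of Algorithm 1 (Steps 1-4) can be sketched as follows. This is a binned variant operating directly on CPD vectors rather than per component, both distributions are assumed to share the same 75-bin discretization, and the data layout and function name are illustrative.

```python
# Sketch of Steps 1-4 of Algorithm 1: feasibility check, penalty accumulation,
# and ranking of jag types (feasible types first, then by ascending penalty).
def rank_jag_types(cpd_cutlist, jag_type_cpds, bin_values, max_jag_types):
    """cpd_cutlist[j]: cut list CPD in bin j; jag_type_cpds[k][j]: CPD of jag
    type k in bin j; bin_values[j]: value V of a piece in bin j."""
    feasible, infeasible = [], []
    for k, cpd_jt in enumerate(jag_type_cpds):
        extra, penalty, ok = 0.0, 0.0, True
        for j, cpd_j in enumerate(cpd_cutlist):
            d = cpd_jt[j] + extra - cpd_j       # surplus (or shortfall) in bin j
            if d < 0:
                ok, extra = False, 0.0          # shortfall: mark infeasible
            else:
                extra = d                       # carry the surplus forward
            penalty += abs(d) * bin_values[j]
        (feasible if ok else infeasible).append((penalty, k))
    ranked = sorted(feasible) + sorted(infeasible)  # feasible types first
    return [k for _, k in ranked][:max_jag_types]
```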

3 Jag Sequencing Approaches

As mentioned in Sect. 2, the jag selection algorithm of Wang, et al., determines the most suitable jag for a given cut list. Because the entire production run of an order is not considered, there is no measure that can be directly associated with business goals (such as minimizing raw material cost, minimizing production time, or maximizing profit) to investigate whether the selected jags provide acceptable results. Solution of the jag sequencing problem requires determining the most suitable sequence of jags to produce all components in the order while minimizing raw material cost and production time.

Thus, the objective is to minimize the costs associated with the jag sequence. Let R(Jag_i) be the raw material cost of one board foot of lumber with grade equal to the grade of the ith jag in the sequence, BF(Jag_i) be the board footage of the ith jag, PTC be the opportunity cost incurred for an extra second of production, and T(Jag_i) be the time required to cut components out of the ith jag in the sequence. Then, the evaluation function for the jth jag sequence can be represented as:


$$E(JS_j) = \sum_{i=1}^{NJ_j} \left( R(Jag_i)\, BF(Jag_i) + T(Jag_i)\, PTC \right) \qquad (8)$$

where NJ_j is the number of jags in the jth jag sequence.

There is also a time constraint in solving this problem. A 15 minute break in production must occur every two hours. Thus, the most suitable jag sequence should be determined within 15 minutes.
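The evaluation function (8) is straightforward to sketch; the Jag record below is a hypothetical stand-in for the simulation's jag data, not the authors' data model.

```python
from dataclasses import dataclass

# Sketch of the jag-sequence evaluation function in Eq. (8).
@dataclass
class Jag:
    cost_per_bf: float    # R(Jag): raw material cost per board foot
    board_feet: float     # BF(Jag): board footage of the jag
    cut_seconds: float    # T(Jag): time to cut components out of the jag

def sequence_cost(jags, ptc):
    """Eq. (8): raw material cost plus opportunity cost of processing time."""
    return sum(j.cost_per_bf * j.board_feet + j.cut_seconds * ptc for j in jags)
```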

3.1 Zero Intelligence Jag Sequencing

In order to have a benchmark to evaluate the proposed methods for optimizing the weights of the decision criteria used in jag selection, a method with no intelligence is utilized to output the jag sequence. In this method, the jags are selected repeatedly by using the jag selection approach of Algorithm 1 until the order is fulfilled. Thus, the selected jags are optimized with respect to the selection criteria (with fixed weights) for the current cut list, and the overall objective is ignored.

3.2 Fuzzy Rulebase Based Analytic Hierarchy Process

Wang, et al. [23] extended their jag selection method to consider factors other than the penalty due to the difference between the length-quantity distribution of the cut list and that associated with the jag types.

The chopsaw yield, average processing time, the above-mentioned penalty, and the standard material cost associated with the jag types were used to rank the jag types from more desirable to less desirable. Their method determines the weights of these new decision criteria and ranks the jag types by using these weights and the values of the alternatives with respect to the decision criteria. Although AHP is a popular MCDM method to find the weights of decision criteria, it requires experts to make pair-wise comparisons of the importance of decision criteria. However, it is difficult for a decision maker to precisely quantify and process linguistic statements such as "How important is criterion A compared to criterion B?" Fuzzy set theory [24] provides a basis for treating such problems.

Wang, et al., adopted a fuzzy AHP method [4] to determine the weights. Trapezoidal membership functions were used to represent the pair-wise comparisons. Since these comparisons change depending on the characteristics of the cut list, a fuzzy rulebase, with cut list characteristics as antecedents and pair-wise comparisons as consequents, is constructed through interviews with domain experts. There are uncertainties due to different expert opinions and uncertainties due to imprecision in the historical data used in the jag selection process. The antecedent variables of the rulebase are the percentage of long sizes, due date characteristics, and cumulative required quantity of the cut list. The consequents of the rulebase are the pair-wise


comparisons of the importance of decision criteria, namely chopsaw yield, average processing time, penalty (as described in Sect. 2.2), and standard material cost of the jag type.

Let x1 be the percentage of long sizes in the current cut list file, l_i and q_i denote the length and required quantity of the ith component in the order file, respectively, NCO be the number of components in the order file, and LONG be the number of components in the order file with length greater than 2000 mm. Similarly, let x2 be the due date attribute of the current cut list file and d_i be the due date of the ith component in the cut list file, where the due date priority of any component is 1, 2, 3, or 4 and 1 corresponds to the most urgent. Furthermore, let x3 be the scaled cumulative quantity of the current cut list file. Then, the antecedent variables of the rulebase in Fig. 2 can be defined as in (9):

$$x_1 = \lambda \, \frac{\sum_{i \in Long} l_i^2 q_i}{\sum_{j=1}^{NCO} l_j^2 q_j}, \qquad x_2 = \frac{\sum_{i=1}^{NCO} q_i}{\sum_{i=1}^{NCO} q_i d_i}, \qquad x_3 = \frac{\sum_{i=1}^{NCO} q_i}{\gamma} \qquad (9)$$

where λ and γ are set to 2 and 20000, respectively. These constants are used to normalize the antecedent variables.
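A sketch of computing the antecedent variables in (9). The (length, quantity) layout, the 2000 mm long-size threshold taken from the text, and reading x2 as total quantity divided by the quantity-weighted due date priorities are assumptions made for illustration where the extracted equation is ambiguous.

```python
# Sketch of the antecedent variables x1, x2, x3 in Eq. (9).
def cutlist_antecedents(components, due, lam=2.0, gamma=20000.0, long_mm=2000):
    """components: list of (length_mm, quantity); due[i]: due date priority of
    component i (1 = most urgent, 4 = least urgent)."""
    num = sum(l * l * q for l, q in components if l > long_mm)
    den = sum(l * l * q for l, q in components)
    x1 = lam * num / den                                  # long-size percentage
    total_q = sum(q for _, q in components)
    x2 = total_q / sum(q * d for (_, q), d in zip(components, due))  # urgency
    x3 = total_q / gamma                                  # scaled total quantity
    return x1, x2, x3
```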

The rulebase to capture the pair-wise comparisons is shown in Fig. 2. The acronyms VE, E, M, D, VD, L, N, U, S, M, L, ELS, CLS, SLS, ES, SMS, CMS, and EMS represent the linguistic labels Very Easy, Easy, Medium, Difficult, Very Difficult, Loose, Normal, Urgent, Small, Medium, Large, Extremely Less Significant,

Fig. 2a General rulebase structure to capture fuzzy pair-wise comparisons based on cut list characteristics


Fig. 2b Sample rule

Considerably Less Significant, Somewhat Less Significant, Equal Significance, Somewhat More Significant, Considerably More Significant, and Extremely More Significant, respectively. The membership functions of the fuzzy sets used in the rulebase are defined in Table 1. The rulebase contains 45 rules corresponding to all possible combinations of the fuzzy sets that represent the antecedent variables.

In the method proposed by Wang, et al. [23], for a given cut list, x1, x2 and x3 are calculated and fuzzified by using the rulebase shown in Fig. 2. Let μ_i(x_j) denote the membership value of the jth antecedent variable in the ith rule. Then, the antecedent membership values are aggregated to find the firing degree of each rule. Zadeh's MIN operator is used to determine the output fuzzy sets in each rule, as in Mamdani type fuzzy rulebases [12]. Then, the output fuzzy sets are aggregated using Zadeh's MAX operator. Finally, the aggregated output fuzzy set is defuzzified with the centre of gravity method.
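The Mamdani inference scheme just described (trapezoidal sets, MIN firing, MAX aggregation, centre-of-gravity defuzzification) can be sketched generically; the sampled universe and rule encoding below are illustrative, not the authors' implementation.

```python
# Generic Mamdani-style inference sketch: trapezoidal membership, MIN firing,
# MAX aggregation, centre-of-gravity defuzzification on a sampled universe.
def trap(x, a, b, c, d):
    """Trapezoidal membership with breakpoints (a, b, c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

def mamdani_output(rules, inputs, universe):
    """rules: list of (antecedent_sets, consequent_set); each set is (a, b, c, d).
    MIN over antecedents gives the firing degree; the clipped consequents are
    MAX-aggregated and defuzzified by centre of gravity."""
    agg = [0.0] * len(universe)
    for antecedents, consequent in rules:
        fire = min(trap(x, *s) for x, s in zip(inputs, antecedents))
        for i, u in enumerate(universe):
            agg[i] = max(agg[i], min(fire, trap(u, *consequent)))
    den = sum(agg)
    return sum(u * m for u, m in zip(universe, agg)) / den if den else 0.0
```

With a single fully-fired rule whose consequent is the ES set (0.4, 0.45, 0.55, 0.6) from Table 1b, the centre of gravity lands at 0.5, as symmetry suggests.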

After finding the defuzzified values of the pair-wise comparisons, Wang, et al., convert the results from ordinal scale to ratio scale by using Table 2.

Wang, et al., use these results in ratio scale as the modes of triangular fuzzy numbers to construct a fuzzy pair-wise comparison matrix. The left and right spread values are determined as explained in [23]. Then, the fuzzy weights are identified

Table 1a Fuzzy sets and membership functions of the antecedent variables

Variable   Linguistic Term   Membership function
Length     VE                (0, 0, 0.02, 0.04)
           E                 (0.02, 0.04, 0.16, 0.2)
           M                 (0.16, 0.2, 0.4, 0.5)
           D                 (0.4, 0.5, 0.9, 0.95)
           VD                (0.9, 0.95, 1, 1)
Due date   L                 (0, 0, 0.33, 0.4)
           N                 (0.33, 0.4, 0.8, 0.9)
           U                 (0.8, 0.9, 1, 1)
Quantity   S                 (0, 0, 0.2, 0.25)
           M                 (0.2, 0.25, 0.4, 0.45)
           L                 (0.4, 0.45, 1, 1)


Table 1b Fuzzy sets and membership functions of the scales used in pair-wise comparisons in consequent variables

Scale Linguistic Term   Membership function
ELS                     (0, 0, 0.1, 0.15)
CLS                     (0.1, 0.15, 0.25, 0.3)
SLS                     (0.25, 0.3, 0.4, 0.45)
ES                      (0.4, 0.45, 0.55, 0.6)
SMS                     (0.55, 0.6, 0.7, 0.75)
CMS                     (0.7, 0.75, 0.85, 0.9)
EMS                     (0.85, 0.9, 1, 1)

by using the Fuzzy Row Means of Normalized Columns method [4]. After identifying the weights, the Fuzzy TOPSIS method proposed in [5] is used to rank the alternatives. Please refer to [23] for technical details.

Thus, the model has three stages: i) identify the pair-wise comparisons depending on the cut list characteristics, ii) identify weights by processing the result of the first stage with fuzzy AHP, and iii) rank the jag types by using Fuzzy TOPSIS with the weights identified in the second stage. This model provides decision support to select a jag for a given cut list. The first stage is extremely useful because the rules are determined by expert users through interviews, i.e., the rulebase which implies the pair-wise comparisons between the importance of decision criteria is decided by domain experts. Thus, the use of fuzzy sets (linguistic labels) offers a valuable mechanism to capture the uncertainty of the linguistic labels due to different expert opinions under different cut list characteristics. The second stage identifies fuzzy weights associated with the criteria that affect the jag type selection. The use of fuzzy sets is beneficial if the weights are to be presented to the end user, either to give linguistic feedback or to obtain the approval of the experts. However, the jag sequencing problem is different from selecting a jag for a given cut list. In this study, the goal is to find the most suitable weights to obtain a better jag sequence with respect to the objective function in (8). The recommendation supplied by the decision support system is not the weights; it is the jag sequence itself. Hence, the use of fuzzy logic is no longer necessary in the second stage. Thus, in this study, the method proposed by Wang, et al. [23] is simplified. In the simplified model, the fuzzy pair-wise comparisons are obtained by firing the fuzzy rulebase with crisp cut list attribute values. Then, the output fuzzy sets are defuzzified to obtain a crisp comparison matrix for the traditional AHP method, and the traditional TOPSIS method is utilized to rank the alternatives. This is advantageous for the following three reasons: i) imprecision due to expert opinion is still captured in the first stage, ii) computationally expensive fuzzy arithmetic operations

Table 2 Correspondence of ordinal to ratio scale

Scale     Pair-wise Comparison Value
Ordinal   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
Ratio     1/9   1/7   1/5   1/3   1     3     5     7     9


are avoided, and iii) irregularities due to fuzzy arithmetic on triangular fuzzy numbers are avoided. Instead of Fuzzy AHP, this study utilizes conventional AHP with the defuzzified pair-wise comparisons. The weights are determined by the Row Means of Normalized Columns method. The jag types, with scores calculated by using these weights, are ranked by using the conventional TOPSIS method as explained in Sect. 2.

3.3 Weight Determination Using Fuzzy Rulebase Tuned by GA

In the previous subsection, we proposed a simplified method to find the most suitable jag sequence by determining the most suitable weights associated with the decision criteria used to rank the jag types. This goal is achieved with the AHP method. The pair-wise comparisons of the importance of decision criteria under different cut list characteristics, made by domain experts, were used to determine these weights. The uncertainty in these comparisons under different cut list conditions is captured by using fuzzy rulebases.

However, is it necessary to use methods that require human input, such as AHP, to determine the weights? Since the simulation model proposed by Kotak, et al. [11] can quickly evaluate what-if scenarios, we utilize a metaheuristic search mechanism to identify the most suitable weights associated with the decision criteria without pair-wise comparisons acquired from experts. Thus, a Type 1 Mizumoto fuzzy rulebase model [14] is proposed to infer the crisp weights under different cut list conditions.

In this study, the scope of application of fuzzy models is extended. We propose the use of two types of rulebases: (i) a rulebase for the weights to rank the jag types, and (ii) a rulebase for the weights to rank the jags. In rulebase (i), the antecedent variables are similar to those in the rulebase of Wang, et al. [23], i.e., size distribution, due date characteristics, and cumulative required quantity of the order file. Similarly, the consequent variables are the weights of average yield of the jag type, standard material cost of the grade associated with the jag type, and the penalty due to the difference between size distributions associated with the jag type and cut list. The processing time of the jag type and the standard material cost are highly correlated and determined by the grade of the jag types. For this reason, the weight associated with the processing time of the jag type is eliminated from the model. Thus, the input variables of the fuzzy rulebase used to select the jag type are defined as in (9).

After selecting the most suitable jag type, the jags of the selected jag type will be ranked by using the weights identified by rulebase (ii), in which the antecedent variables are the percentage of long components in the cut list, the board footage ratio between jag and cut list, and the scaled difference between the longest and second longest piece in the current cut list. The consequent variables are the weights associated with the board footage and the width code of the jag. The first input variable is defined as in (9). The remaining two input variables of the second type of fuzzy rulebase are defined as follows:

Page 429: Forging New Frontiers: Fuzzy Pioneers I

Application of Fuzzy Systems to Wood Product Manufacturing 427

x4 = ( max_{i=1,…,NO}(l_i) − max_{i ∈ {1,…,NO} − {argmax_i(l_i)}}(l_i) ) / φ    (10)

x5 = max_{Jag_j ∈ SelectedJagSet}(BF(Jag_j)) / BF(cutlist)    (11)

where φ is 3700, Jag_j is the jth jag in the set of jags with the selected jag type chosen by using the weights identified by rulebase (i), and BF(.) is the board footage of the argument.
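The two state variables (10) and (11) reduce to a scaled gap between the two longest components and a board-footage ratio. A minimal sketch; the component lengths and board footages below are hypothetical:

```python
def gap_between_longest(lengths_mm, phi=3700.0):
    """x4 of (10): the difference between the longest and the second
    longest component length in the cut list, scaled by phi = 3700."""
    ordered = sorted(lengths_mm, reverse=True)
    return (ordered[0] - ordered[1]) / phi

def board_footage_ratio(jag_board_feet, cutlist_board_feet):
    """x5 of (11): board footage of the largest jag of the selected jag
    type, relative to the board footage of the cut list."""
    return max(jag_board_feet) / cutlist_board_feet

lengths = [2330, 2000, 1500, 900]                    # hypothetical cut list (mm)
x4 = gap_between_longest(lengths)                    # (2330 - 2000) / 3700
x5 = board_footage_ratio([1552.58, 588.645], 2521.0)
```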

Let NV_k and NR_k be the number of cut list state variables and the total number of rules for the kth decision criterion, respectively. It is assumed that the decision criteria are independent. Thus, a separate rulebase can be built for each criterion. Based on this assumption, the proposed rulebase structure for the kth decision criterion is formally represented:

R_k : ALSO_{i=1}^{NR_k} ( AND_{j=1}^{NV_k} [x_j ∈ X_j isr A_ij] → w_ki ∈ [0, 1] ), ∀k = 1, …, NC    (12)

where x_j is the jth cut list state variable with domain X_j, A_ij is the Type 1 fuzzy set (linguistic value) of the jth cut list state variable in the ith rule with membership function μ_i(x_j) : X_j → [0, 1], and w_ki is the scalar weight of the kth decision criterion in the ith rule. It should be noted that we have the following constraint on the membership functions:

Σ_{i=1}^{NR_k} μ_i(x'_j) = 1, ∀j = 1, …, NV_k,

where x'_j is any crisp value in X_j. A sample fuzzy rulebase to define the weight of standard material cost is given in Fig. 3.

To reduce the number of rules, the antecedent variables were assigned only two linguistic labels. Then, the consequent constants (crisp weights) are tuned by using genetic algorithms [8]. Let L, H, LO, and U denote Low, High, Loose, and Urgent, respectively. Then, the fuzzy sets to represent the linguistic labels associated with the antecedent variables are given in Table 3.

Sugeno’s product, sum, and standard negation triplet will be used in place of AND, OR, and NEGATION, respectively. Since the Mizumoto Type 1 fuzzy rulebase structure [14] is used, IMPLICATION is equivalent to scaling each consequent constant with the corresponding degree of fire. The aggregation of consequents is achieved by taking the weighted average of the scaled consequents.
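This inference scheme (trapezoidal labels as in Table 3, product AND, consequents scaled by degree of fire, weighted-average aggregation) can be sketched as follows. The eight consequent constants are hypothetical stand-ins for values the GA would tune:

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

L = (0.0, 0.0, 0.2, 0.8)   # Low  (Table 3)
H = (0.2, 0.8, 1.0, 1.0)   # High (Table 3); L and H sum to 1 at every x

# Hypothetical consequent constants w_ki over the 2^3 label combinations of
# (percentage long components, due date, total quantity); 0 = L/LO, 1 = H/U:
RULES = {
    (0, 0, 0): 0.2, (0, 0, 1): 0.3, (0, 1, 0): 0.4, (0, 1, 1): 0.5,
    (1, 0, 0): 0.5, (1, 0, 1): 0.6, (1, 1, 0): 0.7, (1, 1, 1): 0.9,
}

def infer_weight(x1, x2, x3, rules=RULES):
    """Fire all rules with the product AND, scale each consequent constant
    by its degree of fire, and aggregate by weighted average."""
    num = den = 0.0
    for labels, w in rules.items():
        fire = 1.0
        for x, lab in zip((x1, x2, x3), labels):
            fire *= trapezoid(x, *(H if lab else L))
        num += fire * w
        den += fire
    return num / den   # den is 1 here, since each variable's labels sum to 1
```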

The consequent scalar weights of the above mentioned rulebases are tuned by using the genetic algorithm toolbox of MATLAB®. Since there are only two linguistic labels for each antecedent variable and three variables in the rulebases that are designed for determining weights to rank jag types and jags, eight scalar values must be tuned

Page 430: Forging New Frontiers: Fuzzy Pioneers I

428 O. Uncu et al.

Fig. 3 Sample rulebase with cut list characteristics as antecedent and scalar weight as consequent

for each weight type. Thus, a total of forty values must be tuned for five different weight types. Therefore, each chromosome in the population has forty genes, where each gene is constrained to have a value in the unit interval. The chromosome structure is shown in Fig. 4, where w_ki is the scalar weight of the kth decision criterion in the ith rule, k = 1, …, 5, and i = 1, …, 8. w_1i, w_2i, w_3i, w_4i, and w_5i stand for the weight associated with the board footage, the width code of the jag, the standard cost of the jag type, the chopsaw yield of the jag type, and the penalty due to mismatch in the distributions associated with the cut list and the jag type, respectively.

The population size is forty chromosomes. The initial population was created randomly from a uniform distribution in the unit interval. The maximum execution time was 15 minutes. The maximum number of generations allowed to have the same best solution before termination was set to 10 iterations. The default values were used for the rest of the genetic algorithm parameters of the MATLAB® GA toolbox.
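The study runs the MATLAB® GA toolbox against the rough-mill simulation; purely as an illustration of the setup (forty genes in the unit interval, a population of forty), a minimal GA might look as follows. The fitness function is a toy surrogate, since the real objective (8) requires costing the simulated jag sequence:

```python
import random

GENES, POP, GENS = 40, 40, 50   # 5 weight types x 8 rules per chromosome

def fitness(chrom):
    """Toy surrogate for objective (8): in the actual system each chromosome
    parameterizes the fuzzy rulebases and the resulting jag sequence is
    costed by the rough-mill simulation. This stand-in peaks at genes = 0.5."""
    return -sum((g - 0.5) ** 2 for g in chrom)

def evolve(seed=0):
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(GENES)] for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:POP // 2]                       # truncation selection
        children = []
        while len(elite) + len(children) < POP:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, GENES)            # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(GENES)                 # point mutation, clamped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

best = evolve()
```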

Table 3 Linguistic labels and membership functions of the antecedent variables

Variable                      Linguistic Term   Membership Function
Percentage Long Components    L                 (0, 0, 0.2, 0.8)
                              H                 (0.2, 0.8, 1, 1)
Due Date                      LO                (0, 0, 0.2, 0.8)
                              U                 (0.2, 0.8, 1, 1)
Total Quantity                L                 (0, 0, 0.2, 0.8)
                              H                 (0.2, 0.8, 1, 1)
BDRatio                       L                 (0, 0, 0.2, 0.8)
                              H                 (0.2, 0.8, 1, 1)
Gap Between Long Components   L                 (0, 0, 0.2, 0.8)
                              H                 (0.2, 0.8, 1, 1)


w11  w12  ...  w18  ...  wki  ...  w51  ...  w58

Fig. 4 Chromosome structure used in the genetic algorithm

4 Experimental Results

In order to test the proposed methods, four order files were selected from actual orders. Since the GA method is a randomized method, it was executed with 100 different seeds to observe whether it converged to the same objective value. The three methods explained in Sect. 3 were compared with respect to the objective function defined in (8). The value of a second of production, PTC, was set to $0.75/sec.

The quantity and due date distributions of these four orders are shown in Figs. 5 and 6, respectively. The total quantities of components that need to be produced in Orders 1, 2, 3, and 4 are 3415, 2410, 3312, and 3410, respectively. The cumulative lengths (lineal meters) of the components ordered in Orders 1, 2, 3, and 4 are 4338.8m, 2609.8m, 3944.7m, and 3652.1m. The gaps between the longest and the second longest component in Orders 1, 2, 3, and 4 are 45 mm, 330 mm, 541 mm, and 500 mm, respectively. The percentages of long components (length greater than 2000 mm) in Orders 1, 2, 3, and 4 are 0%, 8.1%, 27.8%, and 0.2%, respectively. The board footages of Orders 1, 2, 3, and 4 are 4191, 2521, 3811, and 2911, respectively. The first three orders have components with 6/4 (or 40mm) thickness and 57mm width. The last order has components with 5/4 (or 33mm) thickness and 57mm width.

The results for Order 1 are summarized in Table 4. Although the three methods used the same number of jags to produce the order, the Zero Intelligence (ZI), AHP, and GA methods used 8239, 7767, and 7626 board feet of lumber, respectively. Moreover, the ZI method used a CL (Clear) grade jag, which is a considerably better grade, and hence more expensive, than S1 (Shop 1) grade jags. The difference in the overall cost can be attributed to these two observations. Thus, the ZI, AHP, and GA methods are ranked 3rd, 2nd, and 1st with respect to overall cost for Order 1.

Fig. 5 Quantity distribution over length for four benchmark order files (quantity vs. length in mm, one panel per order)

Table 4 Results of three methods for Order 1

Zero Intelligence (overall cost $18,858):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    15.9  20    6/4   1552.58
2     KF    S1    15.4  1     6/4   588.645
3     KF    S1    15    1     6/4   479.108
4     PT    S1    16    1     6/4   1456.37
5     KF    CL    12.8  12    6/4   1384.93
6     PT    S1    15.2  1     6/4   1356.36
7     PT    S1    16    1     6/4   1421.13

AHP - Fuzzy Rulebase (overall cost $18,633):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    15.9  20    6/4   1552.58
2     KF    S1    15.4  1     6/4   588.645
3     KF    S1    15    1     6/4   479.108
4     PT    S1    16    1     6/4   1456.37
5     PT    S1    15.7  1     6/4   1563.05
6     BT    S1    8     20    6/4   771.525
7     PT    S1    15.2  1     6/4   1356.36

GA - Fuzzy Rulebase (overall cost $18,435):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    15.9  20    6/4   1552.58
2     KF    S1    15.4  1     6/4   588.645
3     KF    S1    15    1     6/4   479.108
4     PT    S1    16    1     6/4   1456.37
5     KF    S1    12.9  1     6/4   1809.75
6     KF    S1    12.6  1     6/4   625.793
7     BT    S1    13    20    6/4   1114.43

Fig. 6 Due date distribution for four benchmark order files (frequency vs. due date, one panel per order)


Table 5 Results of three methods for Order 2

Zero Intelligence (overall cost $8,898.40):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     KF    CL    12.8  12    6/4   1384.93
2     BT    S1    12    20    6/4   1157.29
3     YB    S3    14.2  25    6/4   789.623
4     KF    S1    12.6  1     6/4   625.793
5     KF    S1    15    1     6/4   479.108

AHP - Fuzzy Rulebase (overall cost $10,726):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    15.9  20    6/4   1552.58
2     KF    CL    13    6     6/4   420.053
3     KF    S1    12.6  1     6/4   625.793
4     KF    S1    15.4  1     6/4   588.645
5     KF    S1    15    1     6/4   479.108
6     PT    S1    16    20    6/4   1165.86

GA - Fuzzy Rulebase (overall cost $7,924.40):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     KF    CL    12.8  12    6/4   1384.93
2     YB    VG    14    25    6/4   1508.76
3     YB    S3    14.2  25    6/4   789.623
4     KF    S1    15    1     6/4   479.108

The results for Order 2 are summarized in Table 5. The ZI, AHP, and GA methods used different numbers of jags and different overall board footages of lumber for Order 2: the ZI, AHP, and GA methods used 4436, 4832, and 4162 board feet of lumber, respectively. Although the ZI method used a high board footage of a CL (Clear) grade jag, it also used an S3 (Shop 3) grade jag, which is a considerably lower grade than S1 grade jags. The GA method used two jags with higher grades (CL and VG) first and then a lower grade (S3) and a medium grade (S1) to complete the production. The difference in the consumed board footages and associated grades creates the main difference in the ranks of the methods. For Order 2, the ZI, AHP, and GA methods are ranked 2nd, 3rd, and 1st with respect to overall cost.

The results for Order 3 are summarized in Table 6. This time, the ZI and AHP methods used exactly the same jag sequence. Although the GA method used the same number of jags, the board footage used to complete the order is 7665 vs. 9020. The GA method used high grade lumber to produce the longer pieces in the order first, then it used a considerably smaller amount of middle grade S1 jags to finish the rest of the order. For Order 3, the GA method is the winner, and the ZI and AHP methods are tied with respect to overall cost.

The results for Order 4 are summarized in Table 7. Although the jag sequence obtained by the GA method used a higher number of jags than the one acquired by the AHP method, the GA method is slightly better than the AHP method with respect to overall cost. The reason is that the GA method selected jags with a total board footage less than that determined by the AHP method. The AHP method yields a better jag sequence than the ZI method.

The cost differences for the jag sequences acquired by using the ZI, AHP, and GA methods are shown in Fig. 7.

The percentage improvement in overall cost for the three methods in each order, relative to the worst case in the corresponding order, is given in Table 8.


Table 6 Results of three methods for Order 3

Zero Intelligence (overall cost $19,718):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    CL    15.6  20    6/4   1566.86
2     BT    S1    12    20    6/4   1157.29
3     BT    S1    15.9  20    6/4   1552.58
4     KF    S1    18.2  1     6/4   1611.63
5     KF    S1    15.4  1     6/4   588.645
6     KF    S1    19.8  1     6/4   2543.18

AHP - Fuzzy Rulebase (overall cost $19,718):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    CL    15.6  20    6/4   1566.86
2     BT    S1    12    20    6/4   1157.29
3     BT    S1    15.9  20    6/4   1552.58
4     KF    S1    18.2  1     6/4   1611.63
5     KF    S1    15.4  1     6/4   588.645
6     KF    S1    19.8  1     6/4   2543.18

GA - Fuzzy Rulebase (overall cost $15,283):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     KF    CL    18.6  12    6/4   2048.83
2     KF    CL    12.8  12    6/4   1384.93
3     KF    S1    15.4  1     6/4   588.645
4     KF    S1    15    1     6/4   479.108
5     KF    S1    18.2  1     6/4   1611.63
6     BT    S1    15.9  20    6/4   1552.58

Table 7 Results of three methods for Order 4

Zero Intelligence (overall cost $11,201):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    13.8  4     5/4   1322.75
2     BT    S1    10    4     5/4   1234.25
3     BT    S1    13.7  4     5/4   1312.17
4     BT    S1    19.6  12    5/4   755.17
5     BT    S1    11    12    5/4   267.436
6     BT    S1    11.6  6     5/4   784.03
7     BT    S1    11.4  6     5/4   903.318

AHP - Fuzzy Rulebase (overall cost $10,295):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    13.8  4     5/4   1322.75
2     BT    S1    10    4     5/4   1234.25
3     BT    S1    13.7  4     5/4   1312.17
4     BT    S1    11.4  6     5/4   903.318
5     BT    S1    11    12    5/4   267.436
6     BT    S1    11.6  6     5/4   784.03

GA - Fuzzy Rulebase (overall cost $10,093):
Jag#  Mill  Grde  Leng  Wdth  Thck  BdFt
1     BT    S1    13.8  4     5/4   1322.75
2     BT    S1    10    4     5/4   1234.25
3     BT    S1    13.7  4     5/4   1312.17
4     BT    S1    11    12    5/4   267.436
5     BT    S1    12    25    5/4   346.32
6     BT    S1    18.4  8     5/4   434.824
7     BT    S1    11.6  6     5/4   784.03


Fig. 7 Cost Comparison of Methods for Different Orders (overall cost per order for Zero Intelligence, AHP - Fuzzy Rulebase, and GA - Fuzzy Rulebase)

Table 8 Percentage improvement in overall cost for ZI, AHP and GA methods

Order   Zero Intelligence   AHP - Fuzzy Rulebase   GA - Fuzzy Rulebase
1       0                   1.193127585            2.24307986
2       17.12287899         0                      26.12343837
3       0                   0                      22.49213916
4       0                   8.088563521            9.891973931
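The entries of Table 8 follow from the overall costs in Tables 4–7 as 100 · (worst − cost)/worst within each order; a quick re-computation (the reported costs are rounded, so the Order 2 entries recomputed here differ from Table 8 in the second decimal):

```python
# Overall costs (in dollars) reported in Tables 4-7:
costs = {
    1: {"ZI": 18858.0, "AHP": 18633.0, "GA": 18435.0},
    2: {"ZI": 8898.40, "AHP": 10726.0, "GA": 7924.40},
    3: {"ZI": 19718.0, "AHP": 19718.0, "GA": 15283.0},
    4: {"ZI": 11201.0, "AHP": 10295.0, "GA": 10093.0},
}

def improvements(order_costs):
    """Percentage improvement of each method over the worst method
    for the same order: 100 * (worst - cost) / worst."""
    worst = max(order_costs.values())
    return {m: 100.0 * (worst - c) / worst for m, c in order_costs.items()}

table8 = {order: improvements(c) for order, c in costs.items()}
```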

It should be noted that the GA method gives several sets of weights that yield jag sequences with the same overall cost, i.e., multiple solutions to the optimization problem. We checked whether the different sets of weights were close to each other by taking the averages of the weights and running the model with the averaged weights. Since 100 executions for all order files yielded the same result, we concluded that the GA method converges to the same objective value.

5 Conclusions and Future Studies

This study extended the method proposed by Wang, et al. [23] to identify the aggregation weights of a multicriteria decision making problem of jag type and jag selection. However, rather than finding suitable jag types, and hence jags, for a given cut list, the larger issue is to determine a suitable jag sequence for a given order file. Thus, the performance measure used to evaluate the proposed methods is the overall cost (including raw material and processing time costs) of the jag sequence to complete a given order.

As our results have shown, the AHP method, which requires expert opinion for creating a fuzzy rulebase, may yield worse results than the Zero Intelligence method with fixed weights. This result is likely due to inconsistent pair-wise comparisons. The GA method, which utilizes a fuzzy rulebase created randomly and tuned by a genetic algorithm, consistently yielded better results than the other two methods.

The major result of this study is to utilize Type 1 fuzzy rulebases to capture the uncertainty due to the selection of learning parameters of genetic algorithms. This study can be extended to Type 2 by using an approach proposed by Uncu [20] involving the selection of values of learning parameters in the Fuzzy C-Means algorithm [2]. In order to explain the rationale, consider the following analogy. In grey-box system modeling techniques, different opinions are a source of uncertainty. The learning algorithm or different values of learning parameters of the structure identification algorithm can be treated as different experts. In this study, Type 1 fuzzy system models are identified by using different values of the learning parameters of the structure identification method. Thus, the genetic algorithm will be executed with different values of its learning parameters. The uncertainty due to the selection of the learning parameters could be treated as the source of Type 2 fuzziness.

References

1. Belton V., Gear T. (1983) On a shortcoming of Saaty's method of analytic hierarchies. Omega 11: 228–230
2. Bezdek J.C. (1973) Fuzzy Mathematics in Pattern Classification. Ph.D. Thesis, Cornell University, Ithaca
3. Buckley J.J. (1985) Fuzzy hierarchical analysis. Fuzzy Sets and Systems 17: 233–247
4. Chang P.T., Lee E.S. (1995) The estimation of normalized fuzzy weights. Computers & Mathematics with Applications 29: 21–42
5. Chen S.J., Hwang C.L. (1992) Fuzzy multiple attribute decision making. Lecture Notes in Economics and Mathematical Systems 375
6. Chu T.C., Lin Y.C. (2003) A fuzzy TOPSIS method for robot selection. International Journal of Advanced Manufacturing Technology 21: 284–290
7. Deng H. (1999) Multicriteria analysis with fuzzy pair-wise comparisons. International Journal of Approximate Reasoning 21: 215–331
8. Holland J.H. (1975) Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI
9. Hwang C.L., Yoon K. (1981) Multi-Attribute Decision Making: Methods and Applications. Springer-Verlag, New York
10. Ishizaka A., Balkenborg D., Kaplan T. (2005) AHP does not like compromises: the role of measurement scales. Joint Workshop on Decision Support Systems, Experimental Economics & e-Participation: 45–54
11. Kotak D.B., Fleetwood M., Tamoto H., Gruver W.A. (2001) Operational scheduling for rough mills using a virtual manufacturing environment. Proc. of the IEEE International Conference on Systems, Man, and Cybernetics, Tucson, AZ, USA 1: 140–145
12. Mamdani E.H., Assilian S. (1974) An Experiment in Linguistic Synthesis with a Fuzzy Logic Controller. International Journal of Man-Machine Studies 7: 1–13
13. Mikhailov L. (2003) Deriving priorities from fuzzy pair-wise comparison judgements. Fuzzy Sets and Systems 134: 365–385
14. Mizumoto M. (1989) Method of fuzzy inference suitable for fuzzy control. Journal of the Society of Instrument Control Engineering 58: 959–963
15. Saaty T.L. (1980) The Analytic Hierarchy Process. McGraw Hill, New York
16. Saaty T.L. (1988) Multicriteria Decision Making: The Analytic Hierarchy Process. RWS Publications, Pittsburgh, PA
17. Siu N., Elghoneimy E., Wang Y., Fleetwood M., Kotak D.B., Gruver W.A. (2004) Rough mill component scheduling: Heuristic search versus genetic algorithms. Proc. of the IEEE Int. Conference on Systems, Man, and Cybernetics, Netherlands 5: 4226–4231
18. Triantaphyllou E., Lin C.-T. (1996) Development and evaluation of five fuzzy multiattribute decision-making methods. International Journal of Approximate Reasoning 14: 281–310
19. Triantaphyllou E. (1999) Reduction of pair-wise comparisons in decision making via a duality approach. Journal of Multi-Criteria Decision Analysis 8: 299–310
20. Uncu O. (2003) Type 2 Fuzzy System Models with Type 1 Inference. Ph.D. Thesis, University of Toronto, Toronto, Canada
21. Van Laarhoven M.J.P., Pedrycz W. (1983) A fuzzy extension of Saaty's Priority Theory. Fuzzy Sets and Systems 11: 229–241
22. Wang Y., Gruver W.A., Kotak D.B., Fleetwood M. (2003) A distributed decision support system for lumber jag selection in a rough mill. Proc. of the IEEE International Conference on Systems, Man, and Cybernetics, Washington, DC, USA 1: 616–621
23. Wang Y., Elghoneimy E., Gruver W.A., Fleetwood M., Kotak D.B. (2004) A fuzzy multiple decision support for jag selection. Proc. of the IEEE Annual Meeting of the North American Fuzzy Information Processing Society – NAFIPS'04 2: 27–30
24. Zadeh L.A. (1965) Fuzzy Sets. Information and Control 8: 338–353


On the Emergence of Fuzzy Logics from Elementary Percepts: the Checklist Paradigm and the Theory of Perceptions

Ladislav J. Kohout and Eunjin Kim

1 Introduction

The Checklist paradigm provides an organizing principle by means of which one investigates how and why specific systems of fuzzy logic emerge from more fundamental semantic and pragmatic principles. The basic construct that this paradigm uses is a checklist. Checklists are useful in two distinct but complementary ways:

• As abstract formal constructs through which, by a mathematically rigorous procedure, the semantics of a variety of fuzzy logic connectives emerge from basic cognitive principles (Bandler and Kohout 1980, 1986).

• In a concrete way, as a useful tool for acquiring the fuzzy membership function in practical applications (Bandler and Kohout 1980, Kohout and Kim 2000).

The Checklist Paradigm (Bandler and Kohout 1980) also furnishes the semantics for both point and interval-valued fuzzy logics. It clarifies the difference between these two different kinds of logics, also showing their mutual relationship.

Through its cognitive interpretation the Checklist paradigm establishes the fundamental link between fuzzy sets and Zadeh's theory of perception (Zadeh 2001). The granulation of fuzzy classes depends on the selection of the elementary percepts which are captured as atomic properties appearing in the template of the corresponding checklist.

1.1 Plethora of Fuzzy Logic Connectives

Ladislav J. Kohout
Department of Computer Science, Florida State University, Tallahassee, Florida 32306-4530, USA

Eunjin Kim
Department of Computer Science, John D. Odegard School of Aerospace Science, University of North Dakota, Grand Forks ND 58202-9015, USA

M. Nikravesh et al. (eds.), Forging New Frontiers: Fuzzy Pioneers I. © Springer 2007

438 L. J. Kohout, E. Kim

It is well established now that in fuzzy sets and systems we use a number of different many-valued logic connectives. In control engineering, in information retrieval, in analysis of medical data, or when analyzing data in the aviation industry, different many-valued logics are used for different classes of application problems. Although Zadeh's first paper used max and min, in one of his subsequent papers Zadeh already used other connectives, such as the BOLD connective (now called the Lukasiewicz t-norm). Currently, the most widespread systems of connectives are based on t-norms and implication operators linked together by residuation. Other systems link the AND connective with OR by means of DeMorgan rules, but some of these systems may have an implication that does not necessarily residuate. An example is min, max and the Kleene-Dienes implication. In approximate reasoning, one is also interested in using a number of different connectives. This naturally motivates the comparative study of different logic connectives from the point of view of their adequacy for various applications.

1.2 Where did Fuzzy Logic Connectives Come From?

This practical need of knowing the useful properties of different systems of fuzzy logics leads to the question of whether there are any underlying epistemological principles from which the different systems may be generated.

The answer to this question is provided by the Checklist Paradigm that was proposed by Bandler and Kohout in 1979. Since then a number of new results in this area have been obtained. The purpose of this paper is to survey the most important results to date.

2 Linking the Checklist Paradigm to the Theory of Perceptions

When dealing with fuzzy sets, we are concerned not only with semantics but also with epistemological issues. This concern manifests itself in three facets that are mutually interlinked (Zadeh 2001):

• Theory of perceptions,
• Granularity, and
• Computing with words.

The issue of indiscernibility appears deeply entrenched in the very foundations of fuzzy logic in the wider sense. Indistinguishability (a tolerance relation1 in the mathematical sense) is deeply embedded in the semantics of fuzzy logic, whichever (pragmatic) interpretation we use. According to Zadeh, a granule is a fuzzy set of points having the form of a clump of elements drawn together by similarity.

The Checklist paradigm provides the elementary percepts that participate in forming granules, and computing with words provides a framework for computations with granules and for interpreting their meaning.

1 Tolerance is a relation that is symmetric and reflexive.


On the Emergence of Fuzzy Logics from Elementary Percepts: 439

2.1 Basic Principles of the Checklist Paradigm

In this section we present an overview of the Checklist Paradigm semantics. As the Checklist Paradigm furnishes the semantics for both point and interval-valued fuzzy logics, we also clarify the difference between these two types of logics as well as their mutual relationship. The checklist paradigm puts an ordering on pairs of distinct implication operators and other pairs of connectives. It pairs two connectives of the same type (i.e., two ANDs, two ORs, or two implications), thus providing the interval bounds on the value of the fuzzy membership function. The interval logic given by the pair of TOP and BOT connectives has as its membership function a fuzzy-valued function2 μ(Fuzz): X → P(R) with a rectangular shape. Hence it is a special case of fuzzy sets of the second type. We shall call such a logic system a checklist paradigm based fuzzy logic of the second type, or proper interval fuzzy logic.

The logic which has as its membership function a real-valued function that yields a single point as the value of a logic formula, in the way analogous to the assignment of values to the elements of ordinary fuzzy sets (i.e., fuzzy sets of the 1st type), is called a singleton logic, or checklist fuzzy logic of the first kind. In the logic of the 2nd type, the atomic object, the basic element of the valuation space, is a subset of a rectangular shape. In the logic of the 1st type, the atomic object is a singleton.

3 Checklist Paradigm: Its Mathematics and Cognitive Interpretation

First, we provide the checklist paradigm based semantics of a fuzzy membership function for both kinds of logic, fuzzy point-logics and fuzzy interval-logics. That is followed by deriving an alternative semantics of connectives for several systems of fuzzy logic by means of the checklist paradigm.

3.1 The Checklist, Degrees of Membership and Fuzzy Classes

A checklist template Q is a finite family of some items. For the purpose of this paper, we restrict ourselves to specific elementary items, namely properties (Q1, Q2, …, Qi, …, Qn). With a template Q and a given proposition A, one can associate a specific checklist QA = (Q, A) which pairs the template Q with the given proposition A.

A valuation gA of a checklist QA is a function from Q to {0, 1}. Let us denote by the symbol aQ the degree δ(A) to which the proposition A holds with respect to a template Q. This degree is given by the formula

2 P(R) denotes the power set of Reals. An interval is an element of this powerset.


a_Q = δ(A_Q) = Σ_{i=1}^{n} a_i^A, where n = card(Q) and a_i^A = g_A(Q_i)    (1)

The checklist (Bandler and Kohout 1980) is used by an observer to assess the degree of membership of some thing to a specific fuzzy class; hence such a class is a cognitive construct. This thing can be an abstract object, a linguistic statement, a logic sentence or a concrete physical object. An observer can be an abstract mathematical construct, an identification algorithm, a sensor or a measuring device, or a real person. This bridges the gap between the pure mathematical theory of the fuzzy membership function and its empirical acquisition.

Each different choice of properties for the repertory of properties {Qi} defines a different specific class. The degree to which a given object belongs to a chosen class is then obtained by comparison of the properties of a specific checklist with the properties of this object. So δ(A) is in fact a fuzzy membership function.
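A minimal sketch of the degree in (1); here the sum of checked items is normalized by n = card(Q) so that the degree lies in [0, 1]. The five-item valuation is hypothetical:

```python
def membership_degree(valuation):
    """Degree to which proposition A holds w.r.t. template Q: the fraction
    of checklist items the observer valued 1 (the sum in (1) divided by
    n = card(Q))."""
    return sum(valuation) / len(valuation)

g_A = [1, 1, 0, 1, 0]              # hypothetical valuation of a 5-item checklist
delta_A = membership_degree(g_A)   # 3 of 5 elementary percepts present -> 0.6
```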

3.1.1 Cognitive interpretation

The checklist can be given a cognitive interpretation. In this interpretation, the properties {Qi} listed in a checklist are some elementary percepts. So this establishes the fundamental link of fuzzy sets to the theory of perception. It can be seen that granulation of fuzzy classes depends on the choice of the properties {Qi} of a corresponding checklist. This idea was first published in (Bandler and Kohout 1980), together with the mathematics of the coarse and fine structures obtained by means of the checklist.

Furthermore, the interdependency of granules, as well as of different levels of granulation, can be investigated by the analysis of the inter-dependency of various checklists. Non-associative triangle products of relations can be used for this purpose (Bandler and Kohout 1980, Kohout 2001, 2004, Kohout and Kim 2002).

3.2 The Emergence of Fuzzy Logic Connectives

Having just the valuations of logic formulas that provide membership functions is not enough. Membership functions are combined together by logic operations that are defined by many-valued logic connectives. In order to derive the semantics of these connectives, we shall use two checklists. Let us assume that we have two distinct checklists that are filled in by two observers, agents A and B (who may or may not be in fact the same individual). Each observer uses the checklist to assess a different thing3. Each observer then fills in the relevant properties of the given object. As we already know, these properties provide the arguments for the valuation function.

3 As remarked previously, this thing may be either an abstract object, a linguistic statement, a logic sentence or a concrete physical object, etc.

It should be noted that the “observers” may be viewed in two ways:

• In an abstract way, as a mathematical procedure, and
• In a concrete way, as a mechanism for constructing the fuzzy membership function from some empirical material.

3.2.1 Fine and Coarse Structure

A fine valuation structure of a pair of propositions A, B with respect to the template Q is a function from Q into {0, 1} × {0, 1} assigning to each attribute Qi the ordered pair of its values (a_i^A, a_i^B). The cardinality of the set of all attributes whose pair of values is (j, k) is denoted by α_jk. We have the following constraint on the values: α00 + α01 + α10 + α11 = n. Further, we define r0 = α00 + α01, r1 = α10 + α11, c0 = α00 + α10, c1 = α01 + α11. These entities can be displayed systematically in a contingency table (see Table 1). The four α_jk of the contingency table constitute its fine structure. The margins c0, c1, r0, r1 constitute its coarse structure. The coarse structure imposes bounds upon the fine structure, without determining it completely. Hence, associated with the various logical connectives between propositions are their extreme values. Thus we obtain the inequality restricting the possible values of m_i(F). See Bandler and Kohout (1980, 1986).

3.2.2 Valuation of the Fine Structure and its Cognitive Interpretation

Now let F be any logical propositional function of propositions A and B. For i, j ∈ {0, 1} let f(i, j) be the classical truth value of F for the pair (i, j) of truth values; let u(i, j) = αij/n. Then we define the (non-truth-functional) fuzzy assessment of the truth of the proposition F(A, B) to be

m_fin(F(A, B)) = Σ_{i,j} f(i, j) · uij    (2)

This assessment operator will be called the value of the fine structure.
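The fine-structure valuation of Eq. (2) is simple to compute mechanically. The following sketch builds the αij counts from two filled-in checklists and evaluates m_fin; the checklist answers and the choice of F are illustrative, not taken from the paper:

```python
# Sketch of the fine-structure valuation m_fin of Eq. (2).
# The checklist answers and the choice of F are illustrative, not from the paper.

def m_fin(checklist_a, checklist_b, f):
    """Non-truth-functional fuzzy assessment of F(A, B).

    checklist_a, checklist_b -- lists of 0/1 answers, one per attribute Pi.
    f -- classical truth table of F as a function of two crisp values.
    """
    n = len(checklist_a)
    assert n == len(checklist_b) > 0
    # alpha[i][j] = number of attributes answered i for A and j for B
    alpha = [[0, 0], [0, 0]]
    for i, j in zip(checklist_a, checklist_b):
        alpha[i][j] += 1
    # m_fin(F) = sum over i,j of f(i, j) * alpha_ij / n
    return sum(f(i, j) * alpha[i][j] / n for i in (0, 1) for j in (0, 1))

# Classical implication as the propositional function F
imp = lambda i, j: 1 if (not i) or j else 0

A = [1, 1, 0, 1, 0]   # observer A's answers (illustrative)
B = [1, 0, 0, 1, 1]   # observer B's answers (illustrative)
# alpha10 = 1 (second item: yes-A, no-B), so m1 = 1 - 1/5 = 0.8
print(m_fin(A, B, imp))  # -> 0.8
```

With implication as F this reproduces the m1 measure of Eq. (3) below; any other classical truth function f may be passed in its place.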

Table 1 Checklist Paradigm Contingency Table

              No for B   Yes for B   Row total
No for A        α00        α01          r0
Yes for A       α10        α11          r1
Column total    c0         c1           n


442 L. J. Kohout, E. Kim

3.2.3 Percepts and Observers

When human observers summarize observational data into percepts, the fine structure of the observed properties, on the basis of which the summarization is performed, may be hidden. In other words, a summarization is a cognition-based measure that acts on the perceived properties. This may be conscious or subconscious.

We can capture this summarization mathematically by assigning a measure to the fine structure (the interior part of the contingency table). Different measures can be assigned, each having a different cognitive and epistemic interpretation (Bandler and Kohout 1980).

Let us look at the ways that Bandler and Kohout (1980) used the measures for summarization of the fine structure of the contingency tables of the checklists.

Two observers A and B (who may or may not in fact be the same individual) are assessing two different objects V and W (parts in engineering, patients in clinical observations, etc.), each using her/his own checklist, as follows:

• A uses the list on V,
• B uses the list on W.

Both checklists are derived from the same checklist template, i.e. they both contain the list of the same properties.

Then we wish to assign a measure to the degree to which observer A's saying "yes" to items on her/his checklist implies observer B's saying "yes" to the same items on her/his checklist; in brief, a measure of the support that the fine data of the checklists4 give to the statement "if yes-A then yes-B" while referring to the same item (property) on her/his respective checklist.

3.2.4 Assigning Measures to Percepts and Providing Epistemological Interpretation

In classical logic, "if yes-A then yes-B" is satisfied "by performance" whenever yes-A and yes-B occur together, and "by default" whenever no-A occurs, regardless of B's answer. Thus all entries support the statement except for α10. Thus if (in ignorance of any yet finer detail) we weight all items equally, the appropriate classical measure of support is

m1 = (α00 + α01 + α11) / (α00 + α01 + α10 + α11) = 1 − α10/n.    (3)

The selection of different αij s of the contingency table (Table 1) as the components of the measure formula yields different types of logic connectives. Bandler and Kohout (1980) dealt with five implication measures: m1 to m5. Their paper

4 summarized in the fine structure of the contingency table.



Fig. 1 Directed Graph of Transformations of 8-group

contains a wealth of other material on implication operators and their properties. Here we shall deal only with the first three measures.

The classical approach of the first measure is not the only possible one. Another point of view, worthy of attention, says that only the cases in which an item was checked "yes-A" are relevant, that is, only the cases of satisfaction "by performance". In this view the appropriate measure would be a performance measure:

m2 = α11 / (α10 + α11) = 1 − α10/r1.    (4)

Still another point of view wishes to distinguish the proportions of satisfaction "by performance", α11/n, and "by default", r0/n, and to assign as measure the better of the two, thus

m3 = (α11 ∨ (α00 + α01)) / n.    (5)

Each measure generates a different implication operator as shown in Table 2.
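As a sketch, the three measures of Eqs. (3)–(5) can be computed directly from the four cells of a contingency table. The cell counts below are illustrative, and the convention m2 = 1 for the degenerate case r1 = 0 is an assumption, not something stated in the paper:

```python
# Sketch: implication measures m1, m2, m3 of Eqs. (3)-(5), computed from
# the four cells alpha00..alpha11 of a contingency table.

def measures(a00, a01, a10, a11):
    n = a00 + a01 + a10 + a11
    r1 = a10 + a11                    # row total for yes-A
    m1 = 1 - a10 / n                  # every cell supports except alpha10
    m2 = 1 - a10 / r1 if r1 else 1.0  # performance only; 1.0 at r1 = 0 is an assumed convention
    m3 = max(a11, a00 + a01) / n      # better of performance and default proportions
    return m1, m2, m3

# Illustrative table: alpha00=2, alpha01=1, alpha10=1, alpha11=6 (n = 10)
m1, m2, m3 = measures(2, 1, 1, 6)
print(m1)  # -> 0.9
print(m3)  # -> 0.6
```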

5 ∨,∧ are max, min respectively.


Table 2 Bounds on Implication Measures

Measure                       BOT-connective                 TOP-connective
m1 = 1 − (α10/n)              max(1−a, b)                    min(1, 1−a+b)
                              Kleene-Dienes implication      Lukasiewicz implication
m2 = 1 − (α10/r1)             max(0, (a+b−1)/a)              min(1, b/a)
                                                             Goguen G43 implication
m3 = (α11 ∨ (α00+α01))/n      max(a+b−1, 1−a)                (a ∧ b) ∨ (1−a)
                                                             EZ (Zadeh) implication5

The bounds on the normalized values6 of the interior of the contingency table are displayed in the two constraint structures, labeled as MINDIAG and MAXDIAG. The first is obtained by minimizing the values of the diagonal, the second by maximizing them.

3.2.5 Fuzzy Membership and the Emergence of Intervals

We have seen that the checklist paradigm puts an ordering on the pairs of distinct implication operators. Hence, it provides a theoretical justification of interval-valued approximate inference. The measures discussed so far were implicative measures yielding implication operators. We shall denote these7 as m(Ply).

Other connectives have their measures as well. For example, one may have

m1(&) = α11/n = u11,    m1(∨) = (α01 + α10 + α11)/n = u01 + u10 + u11,    (6)

m1(⊕) = (α01 + α10)/n = u01 + u10,    m1(≡) = (α00 + α11)/n = u00 + u11.    (7)

Table 3 MINDIAG and MAXDIAG constraints

6 Normalized values are values given by uij. These can be translated directly into the names of argument variables as shown in Table 3.

7 Ply is a neutral name for an implication operator that does not put any ontological connotation onto it.


Based on the above observations one can make the following statements:

• A specific approximation measure m(F) provides the assessment of the amount of elementary percepts participating in the formation of the membership function within the framework of the fine structure of the checklist.

Within the framework of the coarse structure, the Checklist Paradigm generates pairs of distinct connectives BOT and TOP of the same logical type that determine the end points of intervals of fuzzy membership values expressed by m(F). Thus we have the following general form of inequality:

conBOT ≤ m(F) ≤ conTOP    (8)

For the m1 measure, there are 16 inequalities linking the TOP and BOT types of connectives, thus yielding 16 logical types of TOP and BOT pairs of connectives. Ten of these interval pairs generated by m1 are listed in Table 4. All 16 pairs of connectives derived from m1 are dealt with in (Bandler and Kohout 1982, 1986). The most up-to-date account of the system based on m1 appears in (Kohout and Kim 2004).

3.2.6 Natural Crispness and Fuzziness

The width of the interval produced by an application of a pair of associated connectives (i.e. TOP and BOT connectives) characterizes the margins of imprecision of an interval logic expression.

The interval between the TOP connective and the BOT connective is directly linked to the concept of fuzziness for system m1. We define the un-normalized fuzziness of x (cf. Bandler and Kohout 1980a) as φ(x) = min(x, 1 − x); then for x in the range [0, 1], φ(x) is in the range [0, 0.5], with value 0 if and only if x is

Table 4 Connectives of System with m1

Logical Type (Proposition)     Connectives   BOTTOM                 TOP
Conjunction (AND)              a & b         max(0, a+b−1)      ≤  min(a, b)
Non-Disjunction (Nicod)        a ↓ b         max(0, 1−a−b)      ≤  min(1−a, 1−b)
Non-Conjunction (Sheffer)      a | b         max(1−a, 1−b)      ≤  min(1, 2−a−b)
Disjunction (OR)               a ∨ b         max(a, b)          ≤  min(1, a+b)
Non-Inverse Implication        a ←| b        max(0, b−a)        ≤  min(1−a, b)
Non-Implication                a →| b        max(0, a−b)        ≤  min(a, 1−b)
Inverse Implication            a ← b         max(a, 1−b)        ≤  min(1, 1+a−b)
Implication                    a → b         max(1−a, b)        ≤  min(1, 1−a+b)
Equivalence (iff)              a ≡ b         max(1−a−b, a+b−1)  ≤  min(1−a+b, 1+a−b)
Exclusion (eor)                a ⊕ b         max(a−b, b−a)      ≤  min(2−a−b, a+b)
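The claim that the coarse structure bounds the fine structure can be checked by brute force: fix the margins a = r1/n and b = c1/n, enumerate every contingency table compatible with them, and confirm that the resulting m1 values stay between the BOT and TOP formulas of Table 4. A small sketch for the AND and implication rows (the grid size n = 10 is arbitrary):

```python
# Brute-force check: for fixed margins (a, b) = (r1/n, c1/n), every fine
# structure compatible with them yields m1 values between the BOT and TOP
# formulas of Table 4 (checked here for AND and implication).

def check_bounds(n=10):
    for r1 in range(n + 1):              # number of yes-A items
        for c1 in range(n + 1):          # number of yes-B items
            a, b = r1 / n, c1 / n
            lo = max(0, r1 + c1 - n)     # feasible range of alpha11
            hi = min(r1, c1)
            for a11 in range(lo, hi + 1):
                a10 = r1 - a11
                and_val = a11 / n        # m1(&)
                ply_val = 1 - a10 / n    # m1(->)
                assert max(0, a + b - 1) - 1e-9 <= and_val <= min(a, b) + 1e-9
                assert max(1 - a, b) - 1e-9 <= ply_val <= min(1, 1 - a + b) + 1e-9
    return True

print(check_bounds())  # -> True
```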


crisp, and value 0.5 if and only if x is 0.5. The crispness is defined as the dual of fuzziness8.

The Gap Theorems, displayed as (9) and (10) below (Bandler and Kohout 1986), determine the width of the interval as a function of the fuzziness of the arguments of a logic connective.

(a ≡TOP b − a ≡BOT b) = (a ⊕TOP b − a ⊕BOT b) = 2 min(φa, φb)    (9)

(a &TOP b − a &BOT b) = (a ∨TOP b − a ∨BOT b) = (a →TOP b − a →BOT b) = min(φa, φb)    (10)
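The Gap Theorems can be verified numerically from the TOP and BOT formulas of Table 4; the following sketch checks them for AND, implication and equivalence over an illustrative grid of membership values:

```python
# Numerical check of the Gap Theorems: the width of each interval equals
# min(phi(a), phi(b)) (or twice that for equivalence), where
# phi(x) = min(x, 1 - x) is the fuzziness of an argument.

def phi(x):
    return min(x, 1 - x)

def check_gaps(steps=20):
    for i in range(steps + 1):
        for j in range(steps + 1):
            a, b = i / steps, j / steps
            and_gap = min(a, b) - max(0, a + b - 1)            # AND (Table 4)
            ply_gap = min(1, 1 - a + b) - max(1 - a, b)        # implication
            eqv_gap = (min(1 - a + b, 1 + a - b)
                       - max(1 - a - b, a + b - 1))            # equivalence
            assert abs(and_gap - min(phi(a), phi(b))) < 1e-9
            assert abs(ply_gap - min(phi(a), phi(b))) < 1e-9
            assert abs(eqv_gap - 2 * min(phi(a), phi(b))) < 1e-9
    return True

print(check_gaps())  # -> True
```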

3.3 Cognitive Aspects

Linguists distinguish the surface from the deep structure of language. The deep structure is hidden in the cognitive structures of the brain, while the surface structure manifests itself in utterances of sentences by a speaker.

A similar relationship obtains between the fine and the coarse structures of fuzzy statements. The fine structure is a deep structure that may or may not be hidden, while the coarse structure is on the surface and manifests itself in approximate reasoning.

4 Groups of Transformations

Logic transformations are useful in investigating the mutual interrelationships of logic connectives. The global structure of systems of logic connectives can be fruitfully studied by employing the abstract group properties of their group transformations. A specific abstract group provides a global structural characterization of a specific family of permutations that concretely represent this abstract group. This idea can be used for the global characterization of logic connectives.

4.1 Abstract Groups and their Realization by the Groups of Transformations

An abstract group captures the general structure of a whole class of systems of logics. The concrete realization of such an abstract group is a group of transformations, which captures how the connectives transform into each other. This, of course, also captures the interactions of the connectives.

8 This idea of the natural crispness of a fuzzy proposition as κa = a ∨ (1 − a) and fuzziness as its dual φa = 1 − κa first appeared in (Bandler and Kohout 1978).


An abstract group has to be distinguished from its realization by transformations (Weyl 1950). A realization consists in associating with each element a of the abstract group a transformation T(a) of a point-field9 in such a way that to the composition of elements of the group corresponds the composition of the associated transformations: T(ba) = T(b) T(a).

To each point of the field over which the transformations are applied may correspond some structure, and the transformations capture the laws of dependencies of these structures on each other. In our case, the points of the field of elements of transformations are the individual logic connectives, the inner structure of which is determined by the logic definitions of these connectives.

4.2 Transforming Logics with Commutative AND and OR

4.2.1 Piaget Group of Transformations

A global characterization of the two-valued connectives of logic using groups was first given by the Swiss psychologist Jean Piaget in the context of studies of human cognitive development. Given a family of logical connectives, one can apply various transformations to them. Individual logic connectives are 2-argument logic functions. Transformations are functors that, taking one connective as the argument, produce another connective. Let four transformations on basic propositional functions f(x, y) of 2 arguments be given as follows:

I(f) = f(x, y) : Identity
D(f) = ¬f(¬x, ¬y) : Dual
C(f) = f(¬x, ¬y) : Contradual
N(f) = ¬f(x, y) : Negation    (11)

The transformations {I, D, C, N} are called the identity, dual, contradual, and negation transformations, respectively. It is well known that for crisp (2-valued) logic these transformations determine the Piaget group. This group of transformations is a realization of the abstract Klein 4-element group. The group multiplication table is shown as (12) below.

◦  I  C  N  D
I  I  C  N  D
C  C  I  D  N
N  N  D  I  C
D  D  N  C  I    (12)

9 A transformation is a relation with special properties (i.e. a function) that has its domain X and range Y. These two taken together (i.e. X ∪ Y) form a set called the field.


It is well known that the Piaget group of transformations is satisfied by some many-valued logics (Dubois and Prade 1980). For all the logics described in the quoted reference, the connectives characterized by the Piaget group are just a single family that can be used for point-based many-valued logic inference. By point-based we mean an inferential system in which any formula has just a single truth-value from the interval [0, 1]. Kohout and Bandler (1979, 1992), however, demonstrated that these transformations also apply to interval fuzzy logics. This has been further investigated in subsequent papers. See the list of references in (Kohout and Kim 2004) for further details.

4.2.2 Kohout and Bandler Group of Transformations

Adding new non-symmetrical transformations to those defined by Piaget enriches the algebraic structure of logical transformations. Kohout and Bandler (1979) added the following non-symmetric operations to the four symmetric transformations defined above:

LC(f) = f(¬x, y),    RC(f) = f(x, ¬y)
LD(f) = ¬f(¬x, y),   RD(f) = ¬f(x, ¬y)    (13)

The group multiplication table that includes the asymmetrical transformations given by (13) is shown in Table 5.

It is an 8-element symmetric group S2×2×2. It can be seen that it contains the abstract Klein 4-group as its subgroup.

In order to see the link of these abstractions to logic, we have to describe the concrete realization of the abstract group by means of the transformations of Piaget with the added extensions provided by Kohout and Bandler (1979). We can see that the transformations discovered by Piaget are symmetrical in the arguments x, y with respect to the application of negation, while those provided by Kohout and Bandler are asymmetrical in x, y.

The point-field of the group of transformations that realizes the 8-element symmetric abstract group (cf. Table 5) consists of the following set of points CON. The transformations TP are listed in Table 6 together with the set CON.

It is revealing to display the concrete group of transformations as a graph where

Table 5 Abstract Symmetric 8-element group

◦   I   C   N   D   LC  RC  LD  RD
I   I   C   N   D   LC  RC  LD  RD
C   C   I   D   N   RC  LC  RD  LD
N   N   D   I   C   LD  RD  LC  RC
D   D   N   C   I   RD  LD  RC  LC
LC  LC  RC  LD  RD  I   C   N   D
RC  RC  LC  RD  LD  C   I   D   N
LD  LD  RD  LC  RC  N   D   I   C
RD  RD  LD  RC  LC  D   N   C   I


Table 6 Group of Transformations realizing the abstract 8-element group

CON = {&, ↓, |, ∨, ←|, →|, ←, →}

Each transformation in TP is a permutation of CON, written here as source tuple ↦ image tuple:

I:  (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (&, ↓, |, ∨, ←|, →|, ←, →)
D:  (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (∨, |, ↓, &, →, ←, →|, ←|)
C:  (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (↓, &, ∨, |, →|, ←|, →, ←)
N:  (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (|, ∨, &, ↓, ←, →, ←|, →|)
LC: (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (←|, →|, ←, →, &, ↓, |, ∨)
RC: (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (→|, ←|, →, ←, ↓, &, ∨, |)
LD: (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (←, →, ←|, →|, |, ∨, &, ↓)
RD: (&, ↓, |, ∨, ←|, →|, ←, →) ↦ (→, ←, →|, ←|, ∨, |, ↓, &)

the graph edges are labeled by the names of the transformations that were defined abstractly by (11) and (13) above.

5 Group Representation of the Checklist Paradigm Interval-Based Fuzzy Logic Systems

In this section we shall look at system m1, the transformation group representation of which is a realization of the 8-element symmetric abstract group S2×2×2 introduced and described above. This group includes the implications, the AND and OR connectives, and some other connectives, forming a closed system of connectives. The transformations of another closed system of connectives, containing equivalence and exclusive-or, form a subgroup of the S2×2×2 group.

5.1 Interval System m1 with Lukasiewicz and Kleene-Dienes Bounds

The full systems of connectives we are concerned with contain 16 connectives, ten of which are genuinely two-argument. The following two theorems are concerned with closed subsystems that contain eight connectives.

5.1.1 Group Representation of Subsystems with Eight Connectives

Theorem 1. (Kohout and Bandler 1992) The closed set of connectives generated from the Lukasiewicz implication operator a →5 b = min(1, 1 − a + b) by the transformation T is listed below. This set of connectives together with T is a realization of the abstract group S2×2×2.


Connective                        Type of Interval Bound
g1 = I(→5)  = min(1, 1 − a + b)   →TOP
g2 = C(→5)  = min(1, 1 + a − b)   ←TOP
g3 = D(→5)  = max(0, b − a)       ←|BOT
g4 = N(→5)  = max(0, a − b)       →|BOT
g5 = LC(→5) = min(1, a + b)       ∨TOP
g6 = LD(→5) = max(0, 1 − a − b)   ↓BOT
g7 = RC(→5) = min(1, 2 − a − b)   |TOP
g8 = RD(→5) = max(0, a + b − 1)   &BOT
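Theorem 1 can be reproduced mechanically by applying the eight transformations of (11) and (13) to the Lukasiewicz implication; a sketch comparing the generated connectives with the formulas listed above over a grid of illustrative values:

```python
# Sketch: generating the closed set of Theorem 1 by applying the eight
# transformations of (11) and (13) to the Lukasiewicz implication.

luk = lambda a, b: min(1, 1 - a + b)      # a ->5 b

T = {
    'I':  lambda f: lambda a, b: f(a, b),
    'C':  lambda f: lambda a, b: f(1 - a, 1 - b),
    'N':  lambda f: lambda a, b: 1 - f(a, b),
    'D':  lambda f: lambda a, b: 1 - f(1 - a, 1 - b),
    'LC': lambda f: lambda a, b: f(1 - a, b),
    'RC': lambda f: lambda a, b: f(a, 1 - b),
    'LD': lambda f: lambda a, b: 1 - f(1 - a, b),
    'RD': lambda f: lambda a, b: 1 - f(a, 1 - b),
}

expected = {                              # formulas g1..g8 of Theorem 1
    'I':  lambda a, b: min(1, 1 - a + b),
    'C':  lambda a, b: min(1, 1 + a - b),
    'D':  lambda a, b: max(0, b - a),
    'N':  lambda a, b: max(0, a - b),
    'LC': lambda a, b: min(1, a + b),
    'LD': lambda a, b: max(0, 1 - a - b),
    'RC': lambda a, b: min(1, 2 - a - b),
    'RD': lambda a, b: max(0, a + b - 1),
}

grid = [i / 10 for i in range(11)]
for name, t in T.items():
    g = t(luk)
    assert all(abs(g(a, b) - expected[name](a, b)) < 1e-9
               for a in grid for b in grid)
print("Theorem 1 connectives verified")
```

Replacing `luk` with the Kleene-Dienes implication max(1 − a, b) reproduces the set of Theorem 2 in the same way.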

Theorem 2. (Kohout and Kim 1997). The closed set of connectives generated from the Kleene-Dienes implication operator a →6 b = max(1 − a, b) by the transformation T is listed below. This set of connectives together with T is a realization of the abstract group S2×2×2.

Connective                        Type of Interval Bound
g1 = I(→6)  = max(1 − a, b)       →BOT
g2 = C(→6)  = max(a, 1 − b)       ←BOT
g3 = D(→6)  = min(1 − a, b)       ←|TOP
g4 = N(→6)  = min(a, 1 − b)       →|TOP
g5 = LC(→6) = max(a, b)           ∨BOT
g6 = LD(→6) = min(1 − a, 1 − b)   ↓TOP
g7 = RC(→6) = max(1 − a, 1 − b)   |BOT
g8 = RD(→6) = min(a, b)           &TOP

5.1.2 4-Group Structure of the Group Representation of the Subsystem {≡TOP, ⊕BOT, ≡BOT, ⊕TOP} of m1

It is interesting to see that the Klein group, which is a subgroup of the S2×2×2 group, characterizes the structure of some TOP – BOT pairs of interval logic systems generated by the checklist paradigm. Thus the application of T to the set of interval connectives {≡TOP, ⊕BOT, ≡BOT, ⊕TOP} yields a representation that has the structure of the Klein group.

5.1.3 The TOP and BOT Bounds–Preserving Interval Inferences

Definition 1. A subset A of a set B of connectives is an elementary closed set iff it is a knot in B. This means that

1. any element in A is reachable from any other element in A, and
2. no element outside A is reachable from A by any number of applications of the transformation T. The closure is relative with respect to the application of a specific set of transformations T.

Definition 2. A pure set of connectives is a set that

1. contains only the connectives of one classification type (e.g. only TOP, or only BOT, or only Min-diag, or only Max-diag, etc.), and
2. is an elementary closed set.

Theorem 3. (Kohout and Kim 1997). The following sets of connectives are pure sets of connectives with respect to the transformation T{I, C, LC, RC}:

1) The set of TOP connectives {&, ↓, ←|, →|}. This set is TOP-bound preserving.
2) The set of BOT connectives {&, ↓, ←|, →|}. This set is BOT-bound preserving.
3) The set of TOP connectives {|, ∨, ←, →}. This set is TOP-bound preserving.
4) The set of BOT connectives {|, ∨, ←, →}. This set is BOT-bound preserving.

5.1.4 Max-diag and Min-diag Classification of Interval Connectives

We have seen that the interval TOP and BOT classes of connectives represent the extremes of the fine structure that is captured by the contingency tables. A contingency table acquires its extreme values for the minimal and maximal values on its diagonal, respectively. This yields another classification of interval connectives. This classification reflects the properties of the fine structure, of which the coarse structure is an approximation.

• A contingency table that has a zero on the diagonal, i.e. αii = 0, yields a connective of Min-diag type.
• A contingency table that has a zero off its diagonal, that is, where αij = 0 or αji = 0 and j ≠ i, yields a connective of Max-diag type.

Theorem 4. (Kohout and Kim 1997). The following sets of connectives are pure sets of connectives with respect to the transformation T{I, N, C, D}:

1) The set of Max-diag connectives {&TOP, ↓TOP, |BOT, ∨BOT}.
2) The set of Min-diag connectives {&BOT, ↓BOT, |TOP, ∨TOP}.
3) The set of Min-diag connectives {←BOT, →BOT, ←|TOP, →|TOP}.
4) The set of Max-diag connectives {←TOP, →TOP, ←|BOT, →|BOT}.

5.2 Classification of Connectives

From the previous theorems we can see that there are at least three kinds of classification of connectives generated by the checklist paradigm semantics, namely:


• Interval-based: the TOP-BOT interval pair giving the bounds on the values of the interval. See Table 4 above. For further details see (Bandler and Kohout 1982, 1986).
• Constraint-based (Theorem 4): Maxdiag-Mindiag, characterising the fine structure of the checklist paradigm. For further details see (Kohout and Kim 1997).
• Group-transformation-based: studied in great detail in (Kohout and Kim 2004). The subset {≡TOP, ⊕BOT, ≡BOT, ⊕TOP} given above was just an example of one of the subsystems that fall into this classification.

5.3 Non-commutative Systems of Fuzzy Logics

Recently several works have been published on systems of fuzzy logics with non-commutative AND and OR connectives. It should be noted that the Checklist paradigm has provided semantics for some of these as well. In particular, the systems based on the m2 and m3 measures introduced above are in this category. We shall overview system m2 in some detail.

It is obvious that a non-commutative system of logic has to contain two AND and also two OR connectives. As a consequence of non-commutativity it will also contain two implication operators (denoted by right arrows →) and two co-implication operators ←. As a consequence, some other connectives will also appear in duplicate. We denote the duplicate connectives by bullets attached to the symbol of the connective. For the purpose of capturing non-commutativity, a new transformation is introduced, namely a commutator K that is added to the eight transformations already introduced:

TP = {I, C, D, N, LC, LD, RC, RD, K}    (14)

This is applied to the AND and OR connectives. The following expression is the equational definition of the commutator:

a &• b = K(a & b) = b & a;    a ∨• b = K(a ∨ b) = b ∨ a.    (15)

It is clear that for any connective ∗ the commutator yields a ∗ b = K(b ∗ a).
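As a small sketch of the commutator at work: taking the non-commutative AND of the m2 system, a & b = max(0, (a + b − 1)/a) (derived in Theorem 7 below), K simply swaps the arguments and yields the distinct duplicate connective &•. The handling of a = 0 is an assumed convention, not specified in the text:

```python
# Sketch: the commutator K of (15) applied to the non-commutative AND of
# the m2 system, a & b = max(0, (a + b - 1)/a).

def and_m2(a, b):
    # the value 0 at a = 0 is an assumed convention for the degenerate case
    return max(0.0, (a + b - 1) / a) if a > 0 else 0.0

def K(f):
    """Commutator: K(f)(a, b) = f(b, a)."""
    return lambda a, b: f(b, a)

and_dot = K(and_m2)       # the duplicate connective &-bullet

a, b = 0.9, 0.6
print(and_m2(a, b))       # (0.9 + 0.6 - 1)/0.9, about 0.556
print(and_dot(a, b))      # (0.6 + 0.9 - 1)/0.6, about 0.833
```

The two values differ, which is exactly why the system must carry both & and &•.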

5.4 Commutativity and Contrapositivity of Connectives: Its Presence or Absence

Definition 3. An implication operator → is contrapositive if its valuation satisfies the semantic equality a → b = ¬b → ¬a. Otherwise it is not contrapositive.

The implication operators of Kleene-Dienes (plybot) and Lukasiewicz (plytop) appearing in the interval system m1 are contrapositive. When the implication operator → is not contrapositive, the AND and OR connectives generated by the application of the group-compliant logic transformation T to → are non-commutative.


Theorem 5. Let S2×2×2 be represented by the set of transformations T{I, N, C, D, LC, RC, LD, RD} applied to the set

CON = {&, ∨, ↓, |, →, ←, →|, ←|}

Then, if → is contrapositive, the corresponding & and ∨ in CON must be commutative. For the proof see (Kohout and Kim 1997).

It is clear that for any connective ∗ the commutator yields a ∗ b = K(b ∗ a). We say that a pair of two-argument logic sentences is pairwise commutative if and only if

a ∗ b = K(b ∗ a).    (16)

Commutativity involves restrictions on the transformations of connectives, as does contrapositivity.

Theorem 6.
• For any contrapositive → the following equalities hold:

C[K(a → b)] = K[C(a → b)] = a → b    (17)

• For a non-contrapositive →, (17) above fails, but the following equality holds:

K(C(K(C(a → b)))) = a → b.    (18)

For the proof see (Kohout and Kim 1997).

5.5 The Group of Transformations of the Interval System m2 of Connectives

First, we shall examine two closed subsystems generated by the implications of this system. The abstract group structure of these is captured by S2×2×2. Each of these subsystems contains one AND and one OR of the non-commutative pairs of these connectives. This will be followed by the presentation of the 16-element abstract group S2×2×2×2 and by the graph of the corresponding transformation group representation.

Theorem 7. The following closed set of connectives is generated from the ply-top implication operator a →4 b = min(1, b/a) by the transformation TP:


→   = I(→4)  = min(1, b/a)
←•  = C(→4)  = min(1, (1 − b)/(1 − a))
←|• = D(→4)  = max(0, (b − a)/(1 − a))
→|  = N(→4)  = max(0, (a − b)/a)
∨   = LC(→4) = min(1, b/(1 − a))
↓   = LD(→4) = max(0, (1 − a − b)/(1 − a))
|   = RC(→4) = min(1, (1 − b)/a)
&   = RD(→4) = max(0, (a + b − 1)/a).

This set of connectives together with TP is a realization of the abstract group S2×2×2. It is a closed subsystem of connectives.

This is, however, not the full story. There is another representation that generates the same abstract group S2×2×2. It is generated by the co-implication operator a ←4 b = min(1, a/b):

Theorem 8. The following closed set of connectives is generated from the ply-top co-implication operator a ←4 b = min(1, a/b) by the transformation TP:

←   = I(←4)  = min(1, a/b)
→•  = C(←4)  = min(1, (1 − a)/(1 − b))
→|• = D(←4)  = max(0, (a − b)/(1 − b))
←|  = N(←4)  = max(0, (b − a)/b)
∨•  = LC(←4) = min(1, a/(1 − b))
↓•  = LD(←4) = max(0, (1 − a − b)/(1 − b))
|•  = RC(←4) = min(1, (1 − a)/b)
&•  = RD(←4) = max(0, (a + b − 1)/b).

The transformation group of this set is a representation of the abstract group S2×2×2. It is a closed subsystem of connectives.

Corollary 9. The application of the commutator K to any connective from the closed system of connectives given by Theorem 7 yields the connective of the same type that belongs to the closed system of connectives of Theorem 8, and vice versa.

Proof. A direct application of the commutator K, defined by (16) above, to a connective from one subsystem yields the corresponding connective of the same type in the other subsystem.

Non-commutative systems contain two AND and two OR connectives. The closed system of Theorem 7 contains the & and ∨ connectives. The closed system of Theorem 8 contains their counterparts, the &• and ∨• connectives. These subsystems are linked together by the commutator as shown in Corollary 9. Ten two-argument


Table 7 Interval connectives emerging from measure m2

Bot:  CONbot(a, b)               ≤  CONtop(a, b)             :Top
&     max(0, (a+b−1)/a)          ≤  min(1, b/a)              →
←|    max(0, (b−a)/b)            ≤  min(1, (1−a)/b)          |•
↓     max(0, (1−a−b)/(1−a))      ≤  min(1, (1−b)/(1−a))      ←•
→|•   max(0, (a−b)/(1−b))        ≤  min(1, a/(1−b))          ∧•
→|    max(0, (a−b)/a)            ≤  min(1, (1−b)/a)          |
&•    max(0, (a+b−1)/b)          ≤  min(1, a/b)              ←
←|•   max(0, (b−a)/(1−a))        ≤  min(1, b/(1−a))          ∧
↓•    max(0, (1−a−b)/(1−b))      ≤  min(1, (1−a)/(1−b))      →•

connectives of the system based on measure m2 are shown in Table 7. All sixteen connectives are derived and further examined in (Kohout and Kim 2005).

The non-commutative interval system that emerges from the fine structure of the Checklist paradigm by application of the measure m2 has a 16-element transformation group, the structure of which is captured in Fig. 2.

The next question that naturally arises is: "What abstract group is represented by Fig. 2?" The multiplication table of this group, shown in Table 8, answers this question.

Further details concerning the non-commutative interval system based on the measure m2, which contains the Goguen implication and co-implication, can be found in (Kohout and Kim 1997, 2005). The group classification of the subsystems

Fig. 2 Graph of Transformations of 16-symmetric group


Table 8 Abstract Symmetric 16-element Group S2×2×2×2

of connectives of this non-commutative system appears in (Kohout and Kim 2005). It looks at the structure of the system in terms of the subgroups of S2×2×2×2.

5.6 Interval versus Point Fuzzy Logics

Fuzzy interval logic systems are usually treated as completely different from fuzzy point logics. It is often believed that these two kinds of logics are not linked by any mathematical theory or by foundational ontological and epistemological principles. We shall demonstrate that the Checklist paradigm semantics provides such unifying principles for both.

6 Distinguishing Checklist Paradigm Fuzzy Logics of the Second and the First Kind

In order to clarify these issues we have to provide some meaningful definitions first.

Definition 4. A fuzzy singleton logic (also called a point logic) has as its membership function a real-valued function that yields a single point as the value of a logic formula, in a way analogous to the assignment of values to the elements of ordinary fuzzy sets (i.e. fuzzy sets of the first type). Hence this system should be called a singleton logic, or the checklist fuzzy logic of the first kind.


Definition 5. A fuzzy interval logic, which will also be called a proper interval fuzzy logic, has as its membership function a fuzzy-valued function μ(Fuzz): X → P(R) with a rectangular shape. It is a special case of fuzzy sets of the second type. Hence we shall also call it a fuzzy logic of the second kind.

The interval logic derived with the help of m1 (see Table 2 above) is a proper interval logic – it has a membership function with a rectangular shape. Now we have to demonstrate under which conditions it may collapse into the m1-based singleton fuzzy logic, i.e. a logic of the first kind.

6.1 A Measure for the Expected Case

Systems m1 to m5 are pure fuzzy interval logics, not involving any probabilistic notions. Bandler and Kohout (1980), however, also asked the question as to how the fine structure can be characterized by expected values. When only the row and column totals ri, cj of the fine structure captured by the contingency table are known (see Table 1 above), one can ask what the expected values of the αij are. It can be seen that the ways in which numbers can be sprinkled into the cells of the contingency table so as to give the fixed coarse totals constitute a hypergeometric distribution.

The inequalities determining the interval formed by the BOT and TOP pair of connectives of the same logical type change into equalities if the probabilistic notion of expected value is introduced as an additional constraint. The interval collapses into a single point, the expected value, and the MID connective emerges (Bandler and Kohout 1980).

The expected values for the measure m1 and the corresponding logical types of MID connectives are listed in Table 9; see also Fig. 1 in (Bandler and Kohout 1986). Their paper lists all 16 connectives. Interestingly, the implication operator that emerges from the checklist paradigm derivation is Reichenbach's implication operator, which was introduced by Hans Reichenbach in the late 1930s.

Table 9 Expected case, system m1 – Fuzzy logic of Type 1 (point logic)

Logical type              Connective     Expected value
AND                       a ∧ b          ab
OR                        a ∨ b          a + b − ab
Implication               a → b          1 − a + ab
Inverse Implication       a ← b          1 − b + ab
Non-Implication           ¬(a → b)       a(1 − b)
Non-Inverse Implication   ¬(a ← b)       (1 − a)b
Equivalence               a ≡ b          (1 − a)(1 − b) + ab
Exclusive OR              a ⊕ b          (1 − a)b + a(1 − b)
Nicod                     ¬(a OR b)      (1 − a)(1 − b)
Sheffer                   ¬(a AND b)     1 − ab


6.2 Fuzzy Singleton and Interval Logics Revisited: an Ontological Distinction

The previous section demonstrated that the epistemology offered by the Checklist paradigm could yield ontological distinctions10 between these two kinds of fuzzy logics.

The bounds induced by m1, described by (1) and also shown in Table 3 above as distinctions between Maxdiag and Mindiag, of course still hold for these MID connectives. What changes is the character of the space of values which the lower and upper bounds jointly delimit. In the logic of the 2nd kind the atomic object, the basic element of the valuation space, is a subset of rectangular shape. In the logic of the 1st kind, the atomic object is a singleton. The constraints that delimit the atomic objects of the logic of the 1st kind are tighter, and contain more information, determining a specific single connective on which the bounds of the interval operate. In the expected case for m1 these connectives are probabilistic, i.e. the connectives of the Reichenbach logic.

In the case of the logic of the 2nd type we do not have information about the exact nature of the connective that is constrained by the lower and the upper bound forming the interval. For example, in the case of AND, it could be any continuous t-norm (or, more generally, any copula).
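The containment of the MID connective inside the [BOT, TOP] interval can be checked numerically. The sketch below is our own illustration, under the assumption (consistent with the m1 bounds discussed above) that for AND the bounds are max(0, a + b − 1) and min(a, b), and for implication the Kleene–Dienes and Łukasiewicz operators, with the product and Reichenbach connectives as the respective MID values.

```python
# Numeric check (an illustration, not the authors' code): the expected-value
# (MID) connective of system m1 always lies inside the [BOT, TOP] interval.

def bounds_and(a, b):
    # BOT, MID, TOP for AND: Lukasiewicz t-norm, product, minimum.
    return max(0.0, a + b - 1.0), a * b, min(a, b)

def bounds_imp(a, b):
    # BOT, MID, TOP for implication: Kleene-Dienes, Reichenbach, Lukasiewicz.
    return max(1.0 - a, b), 1.0 - a + a * b, min(1.0, 1.0 - a + b)

grid = [i / 10 for i in range(11)]
for a in grid:
    for b in grid:
        for bot, mid, top in (bounds_and(a, b), bounds_imp(a, b)):
            assert bot - 1e-12 <= mid <= top + 1e-12
```

The check passes on the whole grid; for the logic of the 2nd type only the BOT and TOP rows of these functions are available, and any continuous t-norm could stand in place of the MID value.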

7 Conclusion

The Checklist paradigm has clarified the following issues:

• Many-valued logic connectives which form the basis for operations on fuzzy sets emerge from more basic epistemological principles that can be given a cognitive interpretation, instead of being postulated ad hoc.

• Through the distinction between the fine and the coarse structure, the Checklist paradigm demonstrates that indiscernibility is related to the semantics of fuzzy sets.

• Granules provided by the intervals are primary; the singleton membership function emerges by the collapse of the intervals into points under special circumstances.

• The Checklist paradigm also demonstrates how probability emerges from fuzziness under very special circumstances.

• This sheds some light on the issue of the ontology of subjective probabilities. A cognitive disposition to provide a measure for the expected case presupposes

10 The word ‘ontology’ here has a broader meaning than is customary in computational intelligence or knowledge engineering. In this paper we use it in the broader sense in which it is used in logic and philosophy. Namely, ‘ontology’ here means “what is there” – the basic notions under which the defined systems and their important features exist.


On the Emergence of Fuzzy Logics from Elementary Percepts: 459

some experience. This experience cannot be captured fully by counting frequencies or by postulating axioms for subjective probabilities.

• The Checklist paradigm indicates that this experience could be captured within the framework of Zadeh’s theory of perceptions (Zadeh 2001).

• The Checklist paradigm indicates that a deeper cognitive justification of Zadeh’s theory of perception could come from Piaget’s cognitive epistemology; in particular through the Piaget logic transformation group, which has been empirically discovered and psychologically justified (Inhelder and Piaget 1964).

References

Kohout LJ, Bandler W (1979) Checklist paradigm and group transformations. Technical Note EES–MMS–ckl91.2–1979, Dept of Electrical Engineering Science, University of Essex, UK

Bandler W, Kohout LJ (1980) Semantics of implication operators and fuzzy relational products. Internat J of Man-Machine Studies 12:89–116. Reprinted in: Mamdani EH, Gaines BR (eds) Fuzzy Reasoning and its Applications. Academic Press, London, pp 219–246

Bandler W, Kohout LJ (1984) Unified theory of multiple-valued logical operators in the light of the checklist paradigm. In: Proc of the 1984 IEEE Conference on Systems, Man and Cybernetics. IEEE, New York, pp 356–364

Bandler W, Kohout LJ (1986) The use of checklist paradigm in inference systems. In: Negoita CV, Prade H (eds) Fuzzy Logic in Knowledge Engineering, Chap 7. Verlag TÜV Rheinland, Köln, pp 95–111

Dubois D, Prade H (1980) Fuzzy Sets and Systems: Theory and Applications. Academic Press, New York

Inhelder B, Piaget J (1964) The Early Growth of Logic in the Child. Norton, New York

Kohout LJ (2001) Boolean and fuzzy relations. In: Pardalos PM, Floudas CA (eds) The Encyclopedia of Optimization, Vol 1 A–D. Kluwer, Boston, pp 189–202

Kohout LJ (2001a) Checklist paradigm semantics for fuzzy logics. In: Pardalos PM, Floudas CA (eds) The Encyclopedia of Optimization, Vol 1 A–D. Kluwer, Boston, pp 23–246

Kohout LJ (2004) Theory of fuzzy generalized morphisms and relational inequalities. Internat J of General Systems 22:339–360

Kohout LJ, Bandler W (1992) How the checklist paradigm elucidates the semantics of fuzzy inference. In: Proc of the IEEE Internat Conf on Fuzzy Systems, Vol 1. IEEE, New York, pp 571–578

Kohout LJ, Bandler W (1992a) Use of fuzzy relations in knowledge representation, acquisition and processing. In: Zadeh LA, Kacprzyk J (eds) Fuzzy Logic for the Management of Uncertainty. John Wiley, New York, pp 415–435

Kohout LJ, Bandler W (1993) Interval-valued systems for approximate reasoning based on the checklist paradigm. In: Wang P (ed) Advances in Fuzzy Theory and Technology

Kohout LJ, Kim E (1997) Global characterization of fuzzy logic systems with paraconsistent and grey set features. In: Wang P (ed) Proc 3rd Joint Conf on Information Sciences JCIS’97 (5th Int Conf on Fuzzy Theory and Technology), Vol 1: Fuzzy Logic, Intelligent Control and Genetic Algorithms. Duke University, Research Triangle Park, NC, pp 238–241

Kohout LJ, Kim E (2000) Reasoning with cognitive structures of agents I: Acquisition of rules for computational theory of perceptions by fuzzy relational products. In: Da Ruan, Kerre E (eds) Fuzzy IF-THEN Rules in Computational Intelligence, Chap 8. Kluwer, Boston, pp 161–188

Kohout LJ, Kim E (2002) The role of BK products of relations in soft computing. Soft Computing 6:89–91

Kohout LJ, Kim E (2004) Characterization of interval fuzzy logic systems of connectives by group transformations. Reliable Computing 10:299–334

Kohout LJ, Kim E (2005) Non-commutative fuzzy interval logics with approximation semantics based on the checklist paradigm and their group transformations. In: Proc of FUZZ–IEEE 2005. IEEE Neural Network Council, Piscataway, NJ, CD-ROM

Zadeh LA (2001) From computing with numbers to computing with words – from manipulation of measurements to manipulation of perceptions. In: Wang PP (ed) Computing with Words. John Wiley, New York, pp 35–68