EY–X794E–SG.0003
AdvFS Internals and Troubleshooting
Course Guide
Notice
The information in this publication is subject to change without notice.
COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL.
This guide contains information protected by copyright. No part of this guide may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.
The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of this agreement.
Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.
©2000 Compaq Computer Corporation. All rights reserved. Printed in the USA.
Aero, ALPHA, ALPHA AXP, AlphaServer, AlphaStation, Armada, BackPaq, COMPAQ, Compaq Insight Manager, CompaqCare logo, Counselor, DECterm, Deskpro, DIGITAL, DIGITAL logo, DIGITAL Alpha Systems, Digital Equipment Corporation, DIGITAL UNIX, DirectPlus, FASTART, Himalaya, InfoPaq, Integrity, LicensePaq, Ministation, NetFlex, NonStop, OpenVMS, PaqFax, Presario, ProLiant, ProLinea, ProSignia, QuickBack, QuickFind, Qvision, RDF, RemotePaq, RomPaq, ServerNet, SERVICenter, SmartQ, SmartStart, SmartStation, SolutionPaq, SpeedPaq, StorageWorks, Systempro/LT, Tandem, TechPaq, TruCluster, Tru64 UNIX, registered in United States Patent and Trademark Office.
Atalla, C-Series, Expand, FOX, Guardian, iTP, Measure, Netelligent, and PointView are trademarks of Compaq Computer Corporation.
Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation. MIPS is a trademark of MIPS Computer Systems. Motif, OSF and OSF/1 are registered trademarks of the Open Software Foundation. NFS is a registered trademark of Sun Microsystems, Inc. Oracle is a registered trademark and Oracle7 is a trademark of Oracle Corporation. POSIX is a registered trademark of the Institute of Electrical and Electronics. PostScript is a registered trademark of Adobe Systems, Inc. UNIX is a registered trademark licensed exclusively through X/Open Company Ltd. X Window System is a trademark of the Massachusetts Institute of Technology. Intel, Pentium, and Intel Inside are registered trademarks and Xeon is a trademark of Intel Corporation.
UNIX is a trademark in the US and other countries, licensed exclusively through X-Open Company Ltd.
AdvFS Internals and Troubleshooting
Course Guide
January 2000
Contents
About This Course
About This Course ... xiv
    Introduction ... xiv
    Course Description ... xiv
    Place in Curriculum ... xiv
    Target Audience ... xiv
    Prerequisites ... xv
    Course Goals ... xv
    Nongoals ... xv
Taking This Course ... xvi
    Course Organization ... xvi
    Course Map ... xvi
    Chapter Descriptions ... xvii
    Time Schedule ... xviii
    Course Conventions ... xviii
    Resources ... xix
1 Advanced File System Concepts
About This Chapter ... 1-2
    Introduction ... 1-2
    Objectives ... 1-2
    Resources ... 1-2
Introducing AdvFS ... 1-3
    Overview ... 1-3
    File Domains and Filesets ... 1-3
    AdvFS Characteristics ... 1-3
    AdvFS Capabilities ... 1-4
    Filesets and Partitions ... 1-5
    Volumes ... 1-6
    Filesets ... 1-7
Using Extent-Based Storage ... 1-8
    Overview ... 1-8
    Extent Concepts ... 1-8
    Displaying Extents Using the showfile Command ... 1-9
Logging ... 1-12
    Overview ... 1-12
    Why Logging ... 1-12
    Logging a Transaction ... 1-13
    AdvFS Logging ... 1-13
Cloning ... 1-15
    Overview ... 1-15
    Cloning a Fileset Using clonefset ... 1-15
    Fileset Clones ... 1-15
    Cloning Issues ... 1-16
Striping ... 1-17
    Overview ... 1-17
    File Striping ... 1-17
Using Trashcans ... 1-19
    Overview ... 1-19
    Overview of Trashcans ... 1-19
Reviewing AdvFS Commands ... 1-21
    Overview ... 1-21
    File Domain Commands ... 1-21
    Fileset Commands ... 1-21
    File Commands ... 1-21
Examining AdvFS Architecture and Components ... 1-22
    Overview ... 1-22
    AdvFS Architecture ... 1-22
    AdvFS Components ... 1-24
    AdvFS in Tru64 UNIX V5 ... 1-25
Summary ... 1-26
    Introducing AdvFS ... 1-26
    Using Extent-Based Storage ... 1-26
    Logging ... 1-26
    Cloning ... 1-27
    Striping ... 1-27
    Using Trashcans ... 1-27
    Reviewing AdvFS Commands ... 1-27
    Examining AdvFS Architecture and Components ... 1-27
Exercises ... 1-29
    Introducing AdvFS ... 1-29
    Using Extent-Based Storage ... 1-29
    Cloning ... 1-29
    Using Trashcans ... 1-30
    Striping ... 1-30
Solutions ... 1-31
    Introducing AdvFS ... 1-31
    Using Extent-Based Storage ... 1-33
    Cloning ... 1-36
    Using Trashcans ... 1-39
    Striping ... 1-40
2 AdvFS On-Disk Structures
About This Chapter ... 2-2
    Introduction ... 2-2
    Objectives ... 2-2
    Resources ... 2-2
Introducing AdvFS On-Disk Structures ... 2-3
    Overview ... 2-3
    Two-Level Implementation of AdvFS ... 2-3
    .tags Directory ... 2-4
    BAS On-Disk Format: Everything is a Bitfile ... 2-5
    Bitfiles ... 2-7
    Mcells ... 2-7
    AdvFS File Addresses ... 2-8
    Bitfile-Set ... 2-9
    Reusing Tags ... 2-9
Describing BAS On-Disk Metadata Bitfiles ... 2-10
    Overview ... 2-10
    Domain and Volume Structures ... 2-10
    Per Domain Bitfiles ... 2-11
    Per Volume Bitfiles ... 2-12
    Per Fileset Bitfiles ... 2-13
    Reserved Bitfile Special Names ... 2-15
    Metadata Bitfile Tags ... 2-16
    .tags for Directory Entries for Metadata Bitfiles ... 2-17
    Bitfile Metadata Table ... 2-17
    Mcell Records ... 2-19
    Mcell Page Structure ... 2-20
    RBMT Page 0 ... 2-21
    BMT Page 0 ... 2-21
    BMT Page Format ... 2-21
    Mcell Addresses ... 2-22
    Reserved Mcell Addresses ... 2-22
    Mcell Format ... 2-23
    Mcell Records ... 2-23
    Utilities for Viewing Records ... 2-24
Using Extent Maps ... 2-25
    Overview ... 2-25
    Extent Maps for Nonreserved Files ... 2-25
    Extent Maps for Reserved Files ... 2-25
    Encoding of Extents ... 2-25
Using Tags ... 2-27
    Overview ... 2-27
    Tag File Characteristics ... 2-27
    Tag File Page ... 2-29
    Tagmap Entries ... 2-30
    Root Tag File ... 2-31
    Fileset Tag File ... 2-31
    Cloning through Fileset Tag File ... 2-31
    Utility for Viewing Tag Files ... 2-33
    UNIX Directories ... 2-35
    POSIX Files ... 2-35
    AdvFS Tagfiles and Migration ... 2-36
Assigning Fragments ... 2-37
    Overview ... 2-37
    Fragment Bitfile ... 2-37
    Fragment Groups ... 2-37
    Fragment Header ... 2-38
    Fragment Utilities ... 2-39
    Fragments and Files ... 2-41
Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile ... 2-43
    Overview ... 2-43
    Storage Bitmap Bitfile Characteristics ... 2-43
    SBM Format ... 2-44
    Miscellaneous Bitfile ... 2-47
Summary ... 2-48
    Introducing AdvFS On-Disk Structures ... 2-48
    Describing BAS On-Disk Metadata Bitfiles ... 2-49
    Using Extent Maps ... 2-49
    Using Tags ... 2-50
    Assigning Fragments ... 2-50
    Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile ... 2-50
Exercises ... 2-51
Solutions ... 2-65
3 AdvFS In-Memory Structures
About This Chapter ... 3-2
    Introduction ... 3-2
    Objectives ... 3-2
    Resources ... 3-2
Examining AdvFS In-Memory Structures ... 3-3
    Overview ... 3-3
    Overview of In-Memory Structures ... 3-3
    Big Picture of Data Structure Linkage ... 3-3
Checking the VFS Layer ... 3-5
    Overview ... 3-5
    VFS Specific Structures ... 3-5
    vnode Structure ... 3-7
    mount Structure ... 3-7
Explaining the FAS Layer ... 3-9
    Overview ... 3-9
    FAS Layer Structures ... 3-9
    In-Memory Per File Structures ... 3-9
    bfNode Structure ... 3-10
    fsContext Structure ... 3-11
    In-Memory Per Fileset Structures ... 3-13
    fileSetNode Structure ... 3-14
    Fileset Quota Structures ... 3-16
    User and Group Quota Structures ... 3-17
Locating the BAS Layer ... 3-18
    Overview ... 3-18
    BAS Layer Structure Overview ... 3-18
    Access to BAS Structures ... 3-19
    bfAccess Structure ... 3-19
    Managing bfAccess Structures ... 3-19
    bfSet Structure ... 3-20
    Finding bfSet Structures ... 3-20
    domain Structure ... 3-21
    Finding domain Structures ... 3-21
    vd Structure ... 3-24
Defining Other In-Memory Structures ... 3-27
    Free Space Cache ... 3-27
    Bitfile Buffer Descriptor ... 3-27
    I/O Descriptor ... 3-28
    FTX State Structure ... 3-28
Summary ... 3-29
    Examining AdvFS In-Memory Structures ... 3-29
    Checking the VFS Layer ... 3-29
    Explaining the FAS Layer ... 3-29
    Locating the BAS Layer ... 3-29
    Defining Other In-Memory Structures ... 3-30
Exercises ... 3-31
Solutions ... 3-36
4 AdvFS System Calls and Kernel Interfaces
About This Chapter ... 4-2
    Introduction ... 4-2
    Objectives ... 4-3
    Resources ... 4-3
Describing Entries to AdvFS ... 4-4
    Overview ... 4-4
    VFS Switch Table ... 4-4
    vnode Switch Table ... 4-5
    UBC Interface ... 4-6
    Device Driver Interface Routines ... 4-6
    AdvFS Lightweight Context Interface ... 4-7
    AdvFS I/O Completion Function ... 4-7
    True AdvFS System Call ... 4-8
    Types of AdvFS System Calls ... 4-8
    Domains and Volumes ... 4-10
    Filesets ... 4-11
    Miscellaneous Operations ... 4-12
Starting Up and Recovering in AdvFS ... 4-13
    Overview ... 4-13
    Startup and Recovery Overview ... 4-13
    Mounting the File System ... 4-13
    Activating the Bitfile-Set ... 4-14
    Activating the Domain and Searching for Virtual Disks ... 4-14
    Activating the Domain: Full Activation ... 4-14
    Recovering a Domain ... 4-14
    Recovery Pass: Recovers Domain Consistency ... 4-15
Providing Storage Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16BAS-Level Storage Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16FAS-Level Storage Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16Truncating Bitfiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-17
Cloning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Creating a Clone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18Writing to a Cloned Original . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19Reading from a Clone . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-19Deleting Bitfile from Cloned Original. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20Deleting a Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20Closing a Deleted Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20
Migrating Files and Deleting Filesets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Migrating a Bitfile . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22Deleting a Fileset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22
Documenting Threads. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23AdvFS Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23Fragment Bitfile Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-23I/O Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24AdvFS Cleanup Thread . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25
vii
Describing Entries to AdvFS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Starting Up and Recovering in AdvFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Providing Storage Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25Cloning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26Migrating Files and Deleting Filesets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26Documenting Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-27Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-28
5 Troubleshooting AdvFS
  About This Chapter
    Introduction
    Objectives
    Resources
    Case Study Format
  Describing AdvFS Troubleshooting Practices
    AdvFS Commands and Utilities
    Troubleshooting Tips and Practices
  Troubleshooting File System Corruption
    Overview
    Recognizing File System Corruption
    Causes of AdvFS Corruption
    No Valid File System Error Message
    Mount File System Operation Crashes the System
    Localized Corruption
    Generalized Corruption
    Domain Panic
  Resolving Known AdvFS Issues
    Overview
    Log Half-Full Problem
    Fixing Log Half-Full Problems: Reducing Fragmentation
    Determining Appropriate Log Size
    Fixing Log Half Full Problems: Increasing Log Size Using switchlog
    Fixing Log Half Full Problems: Increasing Log Size Using mkfdmn
    BMT Exhaustion
    Avoiding BMT Exhaustion
    BMT Extent Map Allocations
    BMT Exhaustion: Fixing the Problem
  Case Study 1: RBMT Corruption
    Overview
    Problem Statement: Case Study 1
    Configuration: Case Study 1
    Problem Description: Case Study 1
    Analysis
  Case Study 2: Fragment-Free List Corruption
    Overview
    Problem Statement: Case Study 2
    Configuration: Case Study 2
    Problem Description: Case Study 2
    Analysis
    Things Attempted: Case Study 2
    Final Solution/Summary: Case Study 2
  Case Study 3: Corruption and System Panic
    Overview
    Problem Statement: Case Study 3
    Configuration: Case Study 3
    Problem Description: Case Study 3
    Analysis
    Things Attempted: Case Study 3
    Final Solution/Summary: Case Study 3
  Using the salvage Utility
    What is salvage?
    salvage Examples
    When to Use salvage
    Using salvage in Conjunction with Backup Media
    Using salvage in the Absence of Backup Media
    Using salvage in the Case of Very Large Domains
    Using salvage in the Case of Massive Metadata Corruption
  Summary
    Describing AdvFS Troubleshooting Practices
    Troubleshooting File System Corruption
    Resolving Known AdvFS Issues
    Performing Case Studies
  Exercises
    Describing AdvFS Troubleshooting Practices
  Solutions
    Describing AdvFS Troubleshooting Practices
A AdvFS Commands and Utilities
  About This Appendix
    Introduction
    Resources
  AdvFS Commands and Utilities
    Overview
    Commands in Certain Versions of Tru64 UNIX
    addvol
    DIGITAL UNIX V4.x Specific Information for addvol (-x and -p will be retired)
    advfsstat
    advscan
    balance
    chfile
    chfsets
    chvol
    defragment
    logread
    migrate
    mkfdmn
    DIGITAL UNIX V4.x Specific mkfdmn Information
    mkfset
    mountlist
    ncheck
    nvbmtpg
    nvfragpg
    nvlogpg
    nvtagpg
    rmfdmn
    rmfset
    rmvol
    salvage
    savemeta
    shblk
    shfragbf
    showfdmn
    showfile
    showfsets
    stripe
    switchlog
    tag2name
    vbmtchain
    vbmtpg
    vdf
    vdump
    verify
    vfile
    vfilepg
    vfragpg
    vlogpg
    vlsnpg
    vrestore
    vsbmpg
    vtagpg
Tables
  0-1 Course Schedule
  0-2 Course Conventions
  1-1 Trashcan Commands
  2-1 Metadata Bitfile Tags
  2-2 .tags for Metadata Bitfiles
  5-1 AdvFS Commands and Utilities
  5-2 Log Size Guidelines
  5-3 BMT Extent Map Allocations
  5-4 salvage Options
Figures
  0-1 Course Map
  1-1 Two Filesets Drawing Storage from a Domain Containing Three Volumes
  1-2 Filesets are not Necessarily Related to Partitions
  1-3 Extent-Based Storage
  1-4 Event Sequence for Logging a Transaction
  1-5 Fileset Clones
  1-6 File Striping
  1-7 Trashcans
  1-8 File Access: The Big Picture
  1-9 AdvFS Architecture Overview
  2-1 Two-Level Implementation of AdvFS
  2-2 Using AdvFS Metadata to Translate FAS to BAS
  2-3 BAS On-Disk Format
  2-4 Tag Number to BMT mcell to Logical Blocks
  2-5 BAS On-Disk Metadata Bitfiles
  2-6 Fileset Tag Directory Locating Primary Mcell
  2-7 Reserved Bitfiles On Disk Layout
  2-8 Mcell Page Structure
  2-9 File Access Through Tag File
  2-10 Tag File Allowing Transparent Data Move
  2-11 Fileset Tag File Before and After Cloning
  2-12 Clone Structures After Data Write
  2-13 Relationship to POSIX Files
  2-14 Fragment Bitfile Locating Fragment Groups
  3-1 Big Picture
  3-2 In-Memory Per File Structures
  3-3 In-Memory Per Fileset Structures
About This Course
Introduction
This section describes the contents of the course, suggests ways in which you can most effectively use the materials, and sets up the conventions for the use of terms in the course. It includes:
• Course description: a brief overview of the course contents
• Target audience: who should take this course
• Prerequisites: the skills and knowledge needed to ensure your success in this course
• Course goals and nongoals: what skills or knowledge the course will and will not provide
• Course organization: the structure of the course
• Course map: the sequence in which you should take each chapter
• Chapter descriptions: brief descriptions of each chapter
• Time schedule: an estimate of the amount of time needed to cover the chapter material and lab exercises
• Course conventions: explanation of symbols and signs used throughout this course
• Resources: manuals and books to help you successfully complete this course
Course Description
This lecture-lab course focuses on the internals of the Advanced File System (AdvFS). Practical troubleshooting training on AdvFS is also presented.
Place in Curriculum
This course is part of the UNIX advanced curriculum for system administration and support personnel.
Target Audience
This course is designed for system administrators and support engineers who service or support AdvFS configurations.
Prerequisites
To get the most from this course, you should be able to:
• Install and manage a Tru64 UNIX system.
• Install layered products and register license PAKs.
• Troubleshoot the operating system and make adjustments to improve performance.
• Manage traditional UNIX disk partitions.
• Perform typical UNIX system management tasks.
• Use the Tru64 UNIX kernel configuration tools.
• Set up and manage the LSM, AdvFS, and hardware Redundant Arrays of Independent Disks (RAID) using the command line or graphical user interface.
These prerequisites can be satisfied by taking the following courses:
• Tru64 UNIX System Administration lecture-lab or self-paced
• AdvFS, LSM, and RAID Configuration and Management
Course Goals
To support customers with complex configurations using AdvFS, you should be able to:
• Describe AdvFS internals (architecture, on-disk structures, in-memory structures, algorithms, functions/commands).
• Troubleshoot AdvFS problems.
Nongoals
This course does not cover the following topics:
• RAID hardware and software concepts
• Hierarchical Storage Operating Firmware (HSOF) architecture and design
• Hardware RAID, LSM, and AdvFS configuration and management
• Hardware installation/maintenance and troubleshooting
• Tru64 UNIX operating system internals
Taking This Course
Course Organization
This Course Guide is divided into chapters designed to cover a skill or related group of skills required to fulfill the course goals. Illustrations are used to present conceptual material. Examples are provided to demonstrate concepts and commands.
In this course, each chapter consists of:
• An introduction to the subject matter of the chapter.
• One or more objectives that describe the goals of the chapter.
• A list of resources, or materials for further reference. Some of these manuals are included with your course materials. Others may be available for reference in your classroom or lab.
• The text of each chapter, which includes outlines, tables, figures, and examples.
• A summary that highlights the main points presented in the chapter.
• Exercises that enable you to practice your skills and measure your mastery of the information learned during the course.
Course Map
The Course Map shows how each chapter is related to other chapters and to the course as a whole. Before studying a chapter, you should master all of its prerequisite chapters. On the Course Map, prerequisite chapters appear before the chapters that depend on them. The direction of the arrows determines the order in which the chapters should be covered.
Figure 0-1: Course Map
[Figure: boxes connected by arrows for Advanced File System Concepts, AdvFS On-Disk Structures, AdvFS In-Memory Structures, AdvFS System Calls and Kernel Interfaces, and Troubleshooting AdvFS]
Chapter Descriptions
A brief description of each chapter is listed below.
• Advanced File System Concepts: an overview of the Advanced File System (AdvFS). It includes terminology such as logging, clones, striping, and trashcans, as well as an overview of AdvFS commands.
• AdvFS On-Disk Structures: information about AdvFS on-disk structures. It includes a review of BAS on-disk metadata bitfiles, extent maps, tags, fragments, the storage bitmap, and miscellaneous bitfiles.
• AdvFS In-Memory Structures: information about AdvFS in-memory structures, including VFS, FAS, and BAS layers.
• AdvFS System Calls and Kernel Interfaces: an overview of the entries into AdvFS. It follows startup and recovery, storage management, cloning, file migration, and threads.
• Troubleshooting AdvFS: AdvFS troubleshooting tips and case studies.
Time Schedule
The amount of time required for this course depends on each student's background knowledge, experience, and interest in the various topics.
Use the following table as a guideline.
Table 0-1: Course Schedule
Day  Chapter/Appendix Number  Chapter/Appendix Name  Lecture/Reading Hours  Lab/Exercise Hours
1    1  Advanced File System Concepts  1 hour
     2  AdvFS On-Disk Structures  2 hours  1 hour
     A  AdvFS Commands and Utilities Appendix  1 hour
2    3  AdvFS In-Memory Structures  2 hours  2 hours
     4  AdvFS System Calls and Kernel Interfaces  1 hour  1 hour
     5  Troubleshooting AdvFS  1 hour
Course Conventions
This book uses the following conventions.
Table 0-2: Course Conventions
Convention Description
keyword Keywords (for emphasis) and websites are displayed in this typeface.
example Examples, commands, options, and pathnames are displayed in this typeface.
command(n) Cross-references to command documentation include the section number in the reference pages. For example, fstab(5) means fstab is referenced in Section 5.
$ A dollar sign represents the user prompt.
# A number sign represents the superuser prompt.
[key] This symbol indicates that the named key on the keyboard is pressed.
.
.
.
In examples, a vertical ellipsis indicates that not all lines in the example are shown.
[ ] In syntax descriptions, brackets indicate items that are optional.
variable Italics indicate new terms as well as items that are variable (in syntax descriptions).
Resources
For more information on the topics in this course, see the following:
• Tru64 UNIX AdvFS Reference Pages
• POLYCENTER Advanced File System and Utilities for DIGITAL UNIX; Guide to File System Administration
• Tru64 UNIX System Configuration and Tuning
1
Advanced File System Concepts
About This Chapter
Introduction
This chapter presents an overview of the features of the Advanced File System (AdvFS). It prepares students for the examination of the internal support for the concepts reviewed here.
Objectives
To describe AdvFS at an introductory level, you should be able to:
• Define the terms file domains, filesets, and volumes.
• Describe extent-based storage.
• Describe logging and the benefits of transactions.
• Describe at an advanced level: clones, file striping, and trashcan directories.
• Describe the AdvFS architecture and on-disk format at a high level.
Resources
For more information on topics in this chapter as well as related topics, see the following:
• Advanced File System Administration (Tru64 UNIX Version 4.0f or higher)
• AdvFS Reference pages
Introducing AdvFS
Overview
To study the internals of AdvFS, the basic concepts must be understood. This section reviews the following AdvFS terms and concepts:
• File domains and filesets
• AdvFS characteristics
• AdvFS capabilities
• Filesets and partitions
• Volumes
• Filesets
File Domains and Filesets
Filesets and file domains are distinguishing components of AdvFS. Filesets are similar to mountable file systems. A file domain represents the pool of storage from which the filesets allocate their storage space. The term volume represents the actual storage entity within a domain.
• File domain is a named set of one or more volumes that provides a shared pool of physical storage.
• Volume is any mechanism that behaves like a UNIX block device.
— An entire disk
— A disk partition
— A logical volume configured with the Logical Storage Manager (LSM)
• Fileset represents a portion of the directory hierarchy.
— Follows the logical structure of a traditional UNIX file system.
— Hierarchy of directory names and file names. It's what you mount.
AdvFS Characteristics
The “pools of storage” called domains within AdvFS are characteristics that make AdvFS an advanced file system. Most other file systems lack the ability to draw storage from a pool shared among multiple filesets.
AdvFS goes beyond UFS by allowing you to create multiple filesets that share a common pool of storage within a defined file domain.
A fileset is similar to a file system in the following ways:
• You can mount filesets like you can mount file systems.
• Filesets can have quotas enabled.
• Filesets can be backed up.
AdvFS separates the directory layer from the storage layer. It allows management of the physical storage separately from the directory hierarchy. The directory hierarchy handles file naming and the file system interface: opening and reading files.
The physical storage layer handles write-ahead logging, file allocation, and physical disk I/O functions. It can move a file from one disk to another within a storage domain without changing its pathname.
AdvFS Capabilities
Some special capabilities are available within AdvFS, such as filesets, which offer features not provided by other file systems:
• You can clone a fileset and back it up while users are still accessing the original.
• A fileset can span several disks (volumes) in a file domain.
The most basic advancement provided by AdvFS is the ability to create a fileset that can span multiple volumes.
The following figure depicts two filesets that draw their disk storage from the three volumes within the domain.
Figure 1-1: Two Filesets Drawing Storage from a Domain Containing Three Volumes
[Figure: Fileset A and Fileset B drawing storage from a domain with 3 volumes]
Filesets and Partitions
Each fileset is a uniquely named set of directories and files that form a subtree structure.
The following figure distinguishes a fileset from a partition.
Figure 1-2: Filesets Are Not Necessarily Related to Partitions
Commands associated with domains, volumes and filesets are: mkfdmn, mkfset, addvol, rmvol, showfdmn, showfsets.
Volumes
Volumes represent the basic storage building block for AdvFS. They are sometimes referred to as virtual disks because they function as a disk would in less sophisticated file systems.
An AdvFS volume is:
• A physical storage building block for a file domain.
• Any logical UNIX block device:
— "Real" disk partition
— Hardware RAID logical disk
— LSM volume
• Administered from /etc/fdmns.
Note that the contents of /etc/fdmns should not be changed manually. Any changes should be introduced using AdvFS utilities (such as addvol, rmvol, mkfdmn, and so forth).
The following example shows a symbolic link in a directory under the /etc/fdmns directory pointing to the volume (actual disk storage) that composes this domain. If the domain had more than one volume, there would be multiple links shown. Note that there are directories under /etc/fdmns for each domain.
Example 1-1: Displaying a Directory Under /etc/fdmns
# ls -l /etc/fdmns/usr_domain
total 0
lrwxr-xr-x 1 root system 15 Mar 17 17:56 dsk2g -> /dev/disk/dsk2g
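The directory layout in this example can be inspected programmatically. The following is a minimal read-only sketch in Python, not an AdvFS utility: the function name domain_volumes is hypothetical, and the demonstration builds a throwaway directory that merely imitates /etc/fdmns/usr_domain. It only reads symbolic links, in keeping with the rule that /etc/fdmns is changed through AdvFS utilities only.

```python
import os
import tempfile

def domain_volumes(domain_dir):
    """Return {link_name: target} for every symbolic link in domain_dir."""
    volumes = {}
    for name in sorted(os.listdir(domain_dir)):
        path = os.path.join(domain_dir, name)
        if os.path.islink(path):          # true even if the target does not exist
            volumes[name] = os.readlink(path)
    return volumes

# Demonstration on a temporary stand-in for /etc/fdmns/usr_domain (hypothetical).
with tempfile.TemporaryDirectory() as fdmns:
    dom = os.path.join(fdmns, "usr_domain")
    os.mkdir(dom)
    os.symlink("/dev/disk/dsk2g", os.path.join(dom, "dsk2g"))
    print(domain_volumes(dom))
```

A domain with several volumes would simply yield several entries, one per link.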
Filesets
Filesets are the mountable entities within AdvFS. They function similarly to UFS file systems.
A fileset is:
• A file or directory tree mapped to a domain.
• Created using the command mkfset or through dxadvfs.
• Mounted like a file system.
• Administered from the /etc/fstab file.
The following example shows a simple /etc/fstab file with the last line representing a request to mount the usr fileset, which is within the usr_domain domain on the /usr mount point.
Example 1-2: Mounting Through /etc/fstab
# cat /etc/fstab
/dev/disk/dsk2a    /       ufs      rw 1 1
/proc              /proc   procfs   rw 0 0
usr_domain#usr     /usr    advfs    rw 0 2
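The domain#fileset form of the first field is what distinguishes an AdvFS entry. As an illustration only, the sketch below splits such entries into their parts; parse_advfs_fstab is a hypothetical helper written for this guide, and the sample text is the fstab shown in Example 1-2.

```python
def parse_advfs_fstab(text):
    """Return a list of (domain, fileset, mount_point) for advfs entries."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, short lines, and comment lines.
        if len(fields) < 3 or fields[0].startswith("#"):
            continue
        spec, mount_point, fstype = fields[0], fields[1], fields[2]
        if fstype == "advfs" and "#" in spec:
            domain, fileset = spec.split("#", 1)
            entries.append((domain, fileset, mount_point))
    return entries

FSTAB = """\
/dev/disk/dsk2a / ufs rw 1 1
/proc /proc procfs rw 0 0
usr_domain#usr /usr advfs rw 0 2
"""

print(parse_advfs_fstab(FSTAB))
```

Only the last line is reported, because it is the only entry whose file system type is advfs.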
Using Extent-Based Storage
Overview
AdvFS uses an extent-based storage system to store the data within a file. An extent-based strategy allows a contiguous file to be located using a single extent.
This section introduces:
• Extent concepts
• Using the showfile command to display extents
Extent Concepts
Any attempt to create a file with content involves allocating some disk space from the volumes within the domain. These chunks of disk space are referred to as file extents.
AdvFS always attempts to write each file to disk as a set of contiguous pages called an extent. When a file consists of a few large extents and file access is sequential, I/O performance should be optimal.
An extent map translates the bitfile pages (8192 bytes each) to disk blocks (512 bytes each). The AdvFS storage allocation policy adds pages to a file by preallocating one-fourth of the file size up to 16 pages each time the file is appended. This fosters larger extents. When the file is closed, excess preallocated space is truncated, so space is not wasted. When a file uses only part of the last page, a file fragment is created.
Rather than wasting the rest of the page in the extent, the space is allocated from a special file, the fileset fragment file. Each fileset has a frag file, containing seven groups of fragments, from 1 KB to 7 KB in size. A fragment is allocated from the appropriately sized group.
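The arithmetic behind the preallocation and fragment rules can be sketched as follows. This is a simplified model, not the kernel's allocation code. Two details are assumptions made for the sketch and are not stated in the text: at least one page is always requested, and a partial last page larger than 7 KB is given a full page instead of a fragment.

```python
PAGE_SIZE = 8192           # bytes per bitfile page (from the text)
MAX_PREALLOC_PAGES = 16    # preallocation cap (from the text)

def prealloc_pages(file_pages):
    """Pages to preallocate on append: one-fourth of the file size, capped at 16.

    Assumption: at least one page is always requested.
    """
    return max(1, min(file_pages // 4, MAX_PREALLOC_PAGES))

def frag_kb(file_size):
    """Size in KB (1 to 7) of the fragment for a partial last page, or 0 if none.

    Assumption: a tail larger than 7 KB uses a full page, so no fragment.
    """
    tail = file_size % PAGE_SIZE
    if tail == 0:
        return 0
    kb = -(-tail // 1024)   # ceiling division to whole kilobytes
    return kb if kb <= 7 else 0

print(prealloc_pages(8), prealloc_pages(400), frag_kb(3 * PAGE_SIZE + 2500))
```

The cap keeps a very large file from reserving an ever larger run of pages on each append, while the one-fourth rule lets small files grow their extents quickly.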
The following figure shows the relationship between the logical file, the extent map, and the actual disk space.
Figure 1-3: Extent-Based Storage
[Figure: a logical file made of extent 1 and extent 2, mapped through the extent map to extent 1 and extent 2 in disk space]
Displaying Extents Using the showfile Command
The showfile command is one of the most heavily used in AdvFS. Use showfile to view AdvFS details pertaining to an individual file.
The showfile command displays the extent map of each file. An extent is a contiguous area of disk space that the file system allocates to a file.
• Simple files have one extent map.
• Striped files have an extent map for every stripe segment.
The following example shows a file with a single extent of three pages in size.
Example 1-3: Using showfile to Display a Contiguous File
# showfile -x /usr/users/obrien/disktab

         Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
  596b.8001    1    16      3    simple    **     **  async  100%  disktab

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          3      1      576496          48
        extentCnt: 1
#
XtntType is the extent type, which can be:
• simple, a regular AdvFS file without special extents.
• stripe, a striped file.
• symlink, a symbolic link to a file (ufs, nfsv3, and so on).
The following example shows an empty striped file with two stripes.
Example 1-4: Using showfile to Display a Striped File with Two Stripes
# showfile -x /usr/dennis/stripe1

         Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
     c.8001    2    16      0    stripe     2      8  async  100%  stripe1

    extentMap: 1
        pageOff    pageCnt    volIndex    volBlock    blockCnt
        extentCnt: 0

    extentMap: 2
        pageOff    pageCnt    volIndex    volBlock    blockCnt
        extentCnt: 0
The showfile command cannot display attributes for symbolic links or non-AdvFS files.
This example shows the limitations of showfile when used on a UFS file.
Example 1-5: showfile Command Output from a UFS File
# showfile -x /vmunix

         Id  Vol  PgSz  Pages  XtntType  Segs  SegSz    I/O  Perf  File
         **   **    **     **       ufs    **     **     **    **  vmunix
A simple file has one extent map while a striped file has more than one extent map.
    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
              0          3      1      576496          48
        extentCnt: 1
An extent map displays the following information:
pageOff Starting page number of the extent
pageCnt Number of 8K pages in the extent
vol Number indicating which volume within the domain contains this file
volBlock Starting block number of the extent
blockCnt Number of blocks in the extent
extentCnt Number of extents
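The fields above are enough to translate a file page into a disk location. The sketch below is illustrative only: the Extent tuple and page_to_block function are hypothetical names, and the sample extent is the one shown for disktab. Because a page is 8192 bytes and a block is 512 bytes, each page covers 16 blocks, which is why a 3-page extent has a blockCnt of 48.

```python
from collections import namedtuple

BLOCKS_PER_PAGE = 8192 // 512   # 16 disk blocks per 8K page

# One row of an extent map, using the showfile column meanings.
Extent = namedtuple("Extent", "page_off page_cnt vol vol_block")

def page_to_block(extent_map, page):
    """Return (vol, starting disk block) for a file page, or None if unmapped."""
    for ext in extent_map:
        if ext.page_off <= page < ext.page_off + ext.page_cnt:
            offset_pages = page - ext.page_off
            return ext.vol, ext.vol_block + offset_pages * BLOCKS_PER_PAGE
    return None

disktab = [Extent(page_off=0, page_cnt=3, vol=1, vol_block=576496)]
print(page_to_block(disktab, 2))
```

A file with several extents would have several rows in the list, and the lookup would select the row whose page range contains the requested page.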
Logging
Overview
A characteristic of AdvFS is that the recovery time after a power failure or crash is minimal. This section introduces AdvFS logging.
• Why logging
• Logging a transaction
• AdvFS logging
Why Logging
Another distinguishing characteristic of AdvFS is metadata logging. AdvFS tracks alterations to on-disk metadata by logging the transaction as it occurs. Many file system operations involve several widely separated writes to disk.
A transaction usually consists of more than one write. A crash in between the writes leaves the on-disk file system inconsistent.
The two main benefits of using transactions and logging are:
• Fast crash recovery
This is achieved by using the transaction log to redo committed transactions and undo uncommitted transactions.
The log has a fixed size regardless of the domain size so this bounds the time for recovery.
This is in contrast to UFS which relies on fsck to repair the file system (the time to run fsck is proportional to the number of files in a file system, so as disks get bigger, fsck takes longer).
On average AdvFS crash recovery takes about 10 to 15 seconds.
• Improved performance for metadata-intensive operations
Since all metadata modifications are first written to the log and can be recreated using just the log, the file system can write the actual modifications to disk at a later time.
This allows the file system to wait until it can do bigger I/Os by collecting many unrelated metadata modifications into fewer I/Os.
This is in contrast to UFS which relies on ordered synchronous writes to maintain metadata consistency. The writes are ordered such that fsck can easily repair inconsistencies.
Logging a Transaction
The following figure shows the event sequence when a transaction is logged.
Figure 1-4: Event Sequence for Logging a Transaction
1. Storage is allocated in the bitfile metadata table (BMT) (log record 1).
2. The bitfile tag slot is allocated (log record 2).
3. The directory entry is changed (log record 3).
4. The transaction is committed (log record 4).
5. The buffered log records are written to disk.
6. The buffered bitfile pages are written to disk and the log pages are removed.
AdvFS Logging
AdvFS logs:
• Modifications to its own metadata (internal structures)
• Not user file data, unless atomic write data logging has been enabled using chfile -L
For each transaction, AdvFS:
• Writes a series of log records describing all changes for an operation to disk.
• Performs changes (writes changed blocks to disk).
In case of crash, on restart, the on-disk log indicates which transactions are complete.
[Figure 1-4 labels: a tag directory, a directory, and the log ("log", tag N) holding intention records 1 through 3 followed by a commit record; the numbers 1 through 6 mark the event sequence]
The transaction log records changes to metadata (bitfile and directory data). For example, file creation requires modifying more than one on-disk structure:
• File directory to insert a new file name
• Fileset tag directory to allocate the new file’s tag
• Bitfile table to allocate an entry for the new file
When a file is created, these structures must all be updated.
The changes are first written to a log and then to disk. If the creation process is interrupted by a system crash, the creation process can be completed or undone based on the log.
If the log information is complete, finish the file creation on disk. If the log information is incomplete, undo the file creation on disk. This leaves the file system in a consistent state.
During crash recovery, the log is read to determine what changes must be completed or undone. Since the log has a limited size, the recovery time is bounded by the amount of time it takes to process the log. A larger domain will take no longer to recover. In practice, crash recovery takes less than 10 seconds.
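The redo and undo decisions described above can be sketched with a toy log. This is a simplified model under stated assumptions, not the AdvFS log format: each update record carries both the old and the new value, a dictionary stands in for on-disk metadata, and the record layout and key names are invented for the illustration.

```python
def recover(log_records, disk):
    """Replay a write-ahead log onto disk (a dict standing in for metadata).

    log_records is a list of tuples in log order, either
    ("update", txn_id, key, old_value, new_value) or ("commit", txn_id).
    """
    committed = {txn for kind, txn, *_rest in log_records if kind == "commit"}

    # Redo pass: reapply every change that belongs to a committed transaction.
    for kind, txn, *change in log_records:
        if kind == "update" and txn in committed:
            key, _old, new = change
            disk[key] = new

    # Undo pass: roll back changes of uncommitted transactions, newest first.
    for kind, txn, *change in reversed(log_records):
        if kind == "update" and txn not in committed:
            key, old, _new = change
            if old is None:
                disk.pop(key, None)   # the key did not exist before
            else:
                disk[key] = old
    return disk

# Transaction 1 (a file creation) committed; transaction 2 was cut off by a crash.
log = [
    ("update", 1, "bmt_entry", None, "allocated"),
    ("update", 1, "tag_slot", None, "tag 5"),
    ("update", 1, "dir_entry", None, "newfile"),
    ("commit", 1),
    ("update", 2, "bmt_entry_2", None, "allocated"),
]
disk = {"bmt_entry": "allocated", "bmt_entry_2": "allocated"}
print(recover(log, disk))
```

Because recovery work is proportional to the number of log records and not to the size of the domain, a fixed-size log bounds the recovery time.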
Cloning
Overview
To allow the uninterrupted use of AdvFS data while maintenance operations are in progress, a temporary “clone” of the file system can be requested. This section will introduce cloning concepts.
• Cloning a fileset using the clonefset command
• Fileset clones
• Cloning issues
Cloning a Fileset Using clonefset
Another distinguishing capability within AdvFS is the ability to create a virtual clone of the fileset. Note that the clone is not an actual copy of the data.
Cloning a fileset involves these steps:
1. Locking the master (original) fileset
2. Creating the clone fileset
3. Copying the tag directory of the master to the clone
4. Incrementing the clone count in the master fileset
5. Setting the clone’s cloneID = clone count in the master
Handling a write to the master involves these steps:
1. Creating a bitfile for the file in the clone fileset if the cloned bitfile does not already exist in the clone fileset
2. Modifying the clone fileset’s tag directory to reference the new file
3. Allocating an extent in the new file for the portion being written
4. Copying the original data to the new extent
5. Allowing the write to occur to the file in the master
If the cloned bitfile already exists, it does not include an extent for the portion being written.
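The write-handling steps above amount to copy-on-write. The following sketch models a single file at page granularity; it is an illustration, not the AdvFS implementation, and the class and method names are hypothetical. It omits the tag directory and extent bookkeeping and keeps only the rule that the original data is saved in the clone before the first overwrite.

```python
class ClonedFile:
    """Toy copy-on-write model: page number -> data for one file."""

    def __init__(self, master_pages):
        self.master = dict(master_pages)
        self.clone = {}   # holds only pages that were copied on write

    def write_master(self, page, data):
        # Save the original page in the clone before the first overwrite only.
        if page not in self.clone:
            self.clone[page] = self.master.get(page)
        self.master[page] = data

    def read_clone(self, page):
        # Pages never overwritten are still shared with the master.
        if page in self.clone:
            return self.clone[page]
        return self.master.get(page)

f = ClonedFile({0: "a", 1: "b"})
f.write_master(0, "A")
f.write_master(0, "AA")           # second write to the same page copies nothing
print(f.read_clone(0), f.read_clone(1), f.master[0])
```

Creating the clone copies no file data, which is why the clone is described as virtual; storage is consumed only as the master is modified.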
Fileset Clones
A clone fileset is a read-only copy of a fileset created to capture fileset data at a particular time. The contents of the clone fileset can be backed up while the original fileset remains available to users.
The following figure shows cloning actions including copy-on-write (COW).
Figure 1-5: Fileset Clones
Cloning Issues
Consider these issues when working with cloned filesets:
• Applications should not be writing to the master when the clone is created.
Fortunately cloning time is very fast (seconds) due to copy-on-write.
• A clone is not a backup; it is a tool for minimizing down time for a fileset due to backups:
— Create clone of fileset
— Back up from clone
— Delete clone
[Figure 1-5 labels: within a domain, an application writes to the original (master) fileset and a backup tool reads from the clone, shown in three stages: after the clone is created and before any writes; the first write to a block in the original (master) fileset (COW); and access to COW write blocks in the cloned fileset]
Striping
Overview
AdvFS provides file striping to potentially enhance the performance of file I/O intensive applications. This section introduces the concept of file striping.
File Striping
AdvFS provides file-level striping to help spread the disk I/O over several volumes. The stripe utility directs a zero-length file (a file with no data written to it) to be spread evenly across several volumes within a file domain.
As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.
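The alternation of segments can be expressed as simple arithmetic. The sketch below assumes a plain round-robin layout and assumes that the SegSz value reported by showfile is the number of pages per stripe segment; both are assumptions made for the illustration, and stripe_location is a hypothetical name.

```python
def stripe_location(page, segs, seg_size):
    """Return (stripe index, page offset within that stripe) for a file page."""
    segment = page // seg_size      # which stripe segment the page falls in
    stripe = segment % segs         # segments alternate across the stripes
    offset = (segment // segs) * seg_size + page % seg_size
    return stripe, offset

# Two stripes with eight pages per segment, as in the striped-file example.
for page in (0, 7, 8, 16, 25):
    print(page, stripe_location(page, segs=2, seg_size=8))
```

Sequential access therefore moves from one volume to the next every seg_size pages, which is what spreads the I/O load.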
Existing, nonzero-length files cannot be striped using the stripe utility.
To stripe an existing file, create a new file, use the stripe utility to stripe the new file, and copy the contents of the file you want to stripe into the new striped file. After copying the file, delete the nonstriped file.
Once a file is striped, you cannot use the stripe utility to modify the number of disks that a striped file crosses. To change the volume count of a striped file, you can create a second file with a new volume count, and then copy the contents of the first file into the second file. After copying the file, delete the first file.
The following figure depicts the blocks of a file (1,2,...) that is striped over three volumes. Note that block one is on the first volume, block two is on the second volume, and so forth.
Figure 1-6: File Striping
[Figure: blocks 1, 2, 3, 4, 5, ... of a file distributed across the volumes of a domain]
Using Trashcans
Overview
An advanced file system should provide some user-level comforts as well as administrator and application-level features. This section introduces the notion of a “trashcan” directory from which deleted files can be retrieved.
Overview of Trashcans
The trashcan component within AdvFS allows administrators to prepare for the inadvertent removal of files. The deleted files are moved to the trashcan directory in case the user wants them back. You can configure your systems to retain a copy of deleted files.
Trashcan directories can be attached to one or more directories within the same fileset. Once attached, any file deleted from an attached directory is automatically moved to the trashcan directory. The last version of a file deleted from a directory with a trashcan attached can be returned to the original directory with the mv command.
Root-user privilege is not required to retrieve files from a trashcan directory.
Restrictions include:
• You can restore only the most recently deleted version of a file.
• You can attach more than one directory to the same trashcan directory; however, if you delete files with identical file names from the attached directories, only the most recently deleted file remains in the trashcan directory.
• Files deleted from the trashcan directory are unrecoverable.
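The first two restrictions follow from the trashcan holding one file per name. A toy model makes this concrete; it is illustrative only, the Trashcan class is hypothetical, and dictionaries stand in for directories.

```python
class Trashcan:
    """Toy trashcan: keeps only the most recently deleted file for each name."""

    def __init__(self):
        self.files = {}   # file name -> contents of the last deleted version

    def delete(self, directory, name):
        # Moving the file in replaces any earlier file with the same name.
        self.files[name] = directory.pop(name)

    def restore(self, directory, name):
        directory[name] = self.files.pop(name)

trash = Trashcan()
dir_a = {"notes": "version from A"}
dir_b = {"notes": "version from B"}
trash.delete(dir_a, "notes")
trash.delete(dir_b, "notes")      # overwrites the copy deleted from dir_a
trash.restore(dir_b, "notes")
print(dir_a, dir_b, trash.files)
```

After the second deletion, the version from the first directory can no longer be recovered.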
The following table lists and defines the commands for setting up and managing trashcans.
Table 1-1: Trashcan Commands
Command       Function
mktrashcan    Creates the trashcan
shtrashcan    Shows the contents of the trashcan
rmtrashcan    Removes the trashcan directory
The following picture depicts a standard directory hierarchy on the left. If there is a trashcan directory associated with the fileset, any removed files are placed in the trashcan. If necessary, the files can be moved from the trashcan back to the directory.
Figure 1-7: Trashcans
[Figure: a directory hierarchy and a trashcan directory; rm moves a file into the trashcan directory and mv moves it back]
Reviewing AdvFS Commands
Overview
This section reviews some of the commonly used AdvFS commands.
• File domain commands
• Fileset commands
• File commands
File Domain Commands
Some file domain commands are shown here.
Command       Function
mkfdmn        Creates a file domain
addvol        Adds a new volume to the domain
rmvol         Removes a volume from the domain
balance       Distributes storage over the volumes evenly
defragment    Makes files contiguous if possible
Fileset Commands
Some fileset commands are shown here.
Command       Function
mkfset        Creates a fileset
chfsets       Changes fileset characteristics
clonefset     Creates a fileset clone
File Commands
These are a few file commands.
Command       Function
migrate       Moves a file from one volume to another
stripe        Creates an empty striped file
mktrashcan    Creates a trashcan directory
Examining AdvFS Architecture and Components
Overview
This section examines how AdvFS is put together. The concepts are necessary to understand the internals of AdvFS.
• AdvFS architecture
• AdvFS components
• AdvFS in Tru64 UNIX V5
AdvFS Architecture
AdvFS includes two kernel subsystems:
• File access subsystem (FAS)
— Emulates UNIX file system (UFS) and POSIX file and directory semantics
— Uses bitfiles to implement files and directories
• Bitfile access subsystem (BAS)
— A bitfile is an array of 8K pages named via a tag.
— A tag is a unique identifier within a domain, similar to an inode number.
The bitfile access subsystem manipulates bitfiles: create, open, read, write, add and remove storage. It also interfaces with buffer cache, Volume Manager interface, and I/O scheduling. BAS provides:
• Transaction and log management
• Storage placement and management
• Domain and fileset management
The following figure depicts the software components that may be involved with a typical disk I/O. Note the AdvFS software component in the middle. The Virtual File System (VFS) software directs the processing toward the appropriate file system specific software.
Figure 1-8: File Access: The Big Picture
[Figure: in user mode, an application issues system calls (open(), close(), read(), write()); in kernel mode, the Virtual File System (VFS) uses vnode structures to represent files from any file system; the UNIX File System (UFS) uses inode structures, the Advanced File System (AdvFS) uses bfNode structures, and the Network File System (NFS) uses rnode structures to represent files; also shown are the Logical Storage Manager (LSM) pseudo disk-driver and the disk driver]
Once VFS directs the I/O processing to the AdvFS software, the AdvFS processing can be thought of as having two levels, FAS and BAS, as shown in the following figure.
Figure 1-9: AdvFS Architecture Overview
AdvFS Components AdvFS consists of these components:
• File access subsystem - the POSIX file system layer in AdvFS
— Translates VFS file system requests into BAS requests
— Components:
* Mount, unmount, initialization
* Directory operations (lookup, create, delete)
— File operations (create, read, write, stat, delete, rename)
• Bitfile access subsystem - the bitfile layer in AdvFS
— Components:
* Domain operations (create, delete, open, close)
* Bitfile set operations (create, delete, clone, open, close)
* Bitfile operations (create, delete, open, close, migrate, read, write, add and remove stg)
* Transactions management operations (start, stop,fail, pin pg, pin record, lock, recover)
* Buffer cache operations (pin and unpin page, ref and deref page, flush bitfile, flush cache, prefetch pages, I/O queuing)
* Volume operations (add, remove)
[Figure 1-9 labels: VFS; File Access Subsystem (FAS): VFS operations, vnode operations; Bitfile Access Subsystem (BAS): domains and volumes, bitfiles, transaction management; Block Device Interface]
The term bitfile refers to a generic file as is supported by the BAS. Files in the FAS are simply bitfiles to which the FAS applies POSIX semantics. Therefore, files are instantiated via bitfiles, and in general, file and bitfile are equivalent.
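The division of labor between the two subsystems can be sketched as two small classes. This is a conceptual toy, not kernel code: the class and method names are hypothetical, only page 0 of each file is used, and a dictionary stands in for a directory. It shows the idea that the lower layer knows only tags and 8K pages while the upper layer adds names.

```python
PAGE_SIZE = 8192

class BitfileAccess:
    """Toy BAS: a bitfile is an array of 8K pages named by a tag."""

    def __init__(self):
        self._bitfiles = {}
        self._next_tag = 1

    def create(self):
        tag = self._next_tag
        self._next_tag += 1
        self._bitfiles[tag] = []
        return tag

    def write_page(self, tag, page, data):
        pages = self._bitfiles[tag]
        while len(pages) <= page:             # add storage as needed
            pages.append(bytes(PAGE_SIZE))
        pages[page] = data.ljust(PAGE_SIZE, b"\0")[:PAGE_SIZE]

    def read_page(self, tag, page):
        return self._bitfiles[tag][page]

class FileAccess:
    """Toy FAS: maps file names to tags and delegates storage to the BAS."""

    def __init__(self, bas):
        self.bas = bas
        self.directory = {}                   # file name -> tag

    def create(self, name):
        self.directory[name] = self.bas.create()
        return self.directory[name]

    def write(self, name, data):
        self.bas.write_page(self.directory[name], 0, data)

    def read(self, name, length):
        return self.bas.read_page(self.directory[name], 0)[:length]

fas = FileAccess(BitfileAccess())
fas.create("disktab")
fas.write("disktab", b"hello")
print(fas.read("disktab", 5))
```

Keeping names out of the lower layer is what lets storage be managed, for example moved between volumes, without affecting pathnames.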
AdvFS in Tru64 UNIX V5
Many changes are included in the latest release of Tru64 UNIX (V5).
Version five of Tru64 UNIX has a new version of the on-disk structure of AdvFS. The previous version of the AdvFS on-disk structure was V3. In Tru64 UNIX V5.0, the AdvFS on-disk structure will be at version four.
Additional features include faster directory searches for directories larger than 8K.
AdvFS has added some additional support for very large directories. The performance improvements include the creation of a B-tree index supporting directories greater than 8K in size. This dramatically improves file creation and deletion performance. Improvement becomes more noticeable when the directory contains more than ~2500 files.
Quota limits are now held in 8-byte fields yielding higher limits.
• Removal of metadata limitations (such as BMT page 0 restrictions)
• Direct I/O allowing I/O direct to the application’s address space (no UBC buffering)
• Smooth sync() operations to eliminate the update daemon 30-second system I/O bursts
• SMP improvements
Summary
Introducing AdvFS
A file domain is a named set of one or more volumes that provides a shared pool of physical storage.
A fileset represents a portion of the directory hierarchy. Each fileset is a uniquely named set of directories and files that form a subtree structure.
A volume is any mechanism that behaves like a UNIX block device.
Using Extent-Based Storage
The Advanced File System always attempts to write each file to disk as a set of contiguous pages. The set of contiguous pages is called an extent. An extent map translates the bitfile pages (8192 bytes each) to disk blocks (512 bytes each).
The AdvFS storage allocation policy adds pages to a file by preallocating one-fourth of the file size up to 16 pages each time the file is appended. This fosters larger extents. When the file is closed, excess preallocated space is truncated, so space is not wasted.
When a file uses only part of the last page, a file fragment is created. Rather than wasting the rest of the page in the extent, the space is allocated from a special file, the fileset frag file. Each fileset has a frag file, containing seven groups of fragments, from 1 KB to 7 KB in size. A fragment is allocated from the appropriately sized group.
Logging
Fast crash recovery is achieved by using the transaction log to redo committed transactions and undo uncommitted transactions. The log has a fixed size regardless of the domain size, so this bounds the time for recovery.
This is in contrast to UFS which relies on fsck to repair the file system. The time to run fsck is proportional to the number of files in a file system, so as disks get bigger, fsck takes longer. On average AdvFS crash recovery takes about 10 to 15 seconds.
Transaction logging also improves performance for metadata-intensive operations. Since all metadata modifications are first written to the log and can be recreated using just the log, the file system can write the actual modifications to disk at a later time. This allows the file system to wait until it can do bigger I/Os by collecting many unrelated metadata modifications into fewer I/Os.
This is in contrast to UFS which relies on ordered synchronous writes to maintain metadata consistency. The writes are ordered such that fsck can easily repair inconsistencies.
Cloning
A clone fileset is a read-only copy of a fileset created to capture fileset data at a particular time. The contents of the clone fileset can be backed up while the original fileset remains available to users.
Striping
The stripe utility directs a zero-length file (a file with no data written to it) to be spread evenly across several volumes within a file domain. As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.
Using Trashcans
You can configure your systems to retain a copy of deleted files. Trashcan directories can be attached to one or more directories within the same fileset. Once attached, any file deleted from an attached directory is automatically moved to the trashcan directory. The last version of a file deleted from a directory with a trashcan attached can be returned to the original directory with the mv command.
Reviewing AdvFS Commands
Some file domain commands are shown here.
Command       Function
mkfdmn        Creates a file domain
addvol        Adds a new volume to the domain
rmvol         Removes a volume from the domain
balance       Distributes storage over the volumes evenly
defragment    Makes files contiguous if possible
Examining AdvFS Architecture and Components
AdvFS includes two kernel subsystems:
• File access subsystem
— Emulates UFS and POSIX file and directory semantics
— Uses bitfiles to implement files and directories
• Bitfile access subsystem
— Manipulates bitfiles: create, open, read, write, add and remove storage
A bitfile is an array of 8K pages, named via a tag. A tag is a unique identifier within a domain, similar to an inode number.
— Interfaces with buffer cache, VM interface, and I/O scheduling
— Provides transaction and log management
— Provides storage placement and management
— Provides domain and fileset management
Exercises
To successfully complete the following exercises, you must be able to perform the following tasks:
• Create an AdvFS file domain with multiple disks and filesets.
• Create an AdvFS clone fileset.
• Create an AdvFS striped file.
• Add and remove volumes to a file domain.
• Defragment a file domain.
• Balance a file domain.
• Add and change fileset attributes, in particular, fileset quotas.
• Use the showfdmn and showfsets commands to obtain information about an AdvFS file domain.
• Use the showfile command to obtain information about an AdvFS file.
• Recreate the AdvFS management structure contained in /etc/fdmns using mkdir and ln.
Introducing AdvFS
If you are completely comfortable with AdvFS commands, skip this exercise set and move forward to Exercise Set 2.
1. Create a file domain using at least two volumes that contains at least two filesets. If you have only one disk, you may have to repartition it to get the two volumes.
2. Make mount points and mount the filesets.
3. Use df -t advfs to check on the available space for each fileset.
Using Extent-Based Storage
1. Add another volume to the domain. Check available space.
2. Create some large files to take up some space in the filesets. How can a fileset be prevented from taking up all of the available storage in the domain?
Cloning
1. Make a clone of one of your filesets. How long did it take to create?
2. Check the contents of the clone. Does it match the contents of the original?
3. Put a new file in the original fileset. Does it appear in the clone?
4. Try to add a new file to the clone. What happened? Explain.
Using Trashcans
1. Delete a file from one of your filesets. Can you get it back?
2. Create a trashcan. Associate it with your fileset.
3. Delete a file from the fileset. Can you get it back?
Striping
1. Create an empty striped file. Use the showfile -x command to view the extents of the empty file.
2. Show the extents of one of your large files.
3. Copy the large file to the empty striped file.
4. Revisit the extents of the striped file. Is there any performance difference in reading the two files?
5. Use #time cat /mnt_pnt/big_file >/dev/null.
Solutions
Introducing AdvFS
1. Create a file domain using at least two volumes that contains at least two filesets. If you have only one disk, you may have to repartition it to get the two volumes.
#
# disklabel -r /dev/rdisk/dsk0c
# /dev/rdisk/dsk0c:
type: SCSI
disk: RZ26F
label:
flags:
bytes/sector: 512
sectors/track: 57
tracks/cylinder: 14
sectors/cylinder: 798
cylinders: 2570
sectors/unit: 2050860
rpm: 5400
interleave: 1
trackskew: 40
cylinderskew: 43
headswitch: 0 # milliseconds
track-to-track seek: 0 # milliseconds
drivedata: 0
8 partitions:
# size offset fstype [fsize bsize cpg] # NOTE: values not exact
a: 131072 0 unused 0 0 # (Cyl. 0 - 164*)
b: 262144 131072 unused 0 0 # (Cyl. 164*- 492*)
c: 2050860 0 unused 0 0 # (Cyl. 0 - 2569)
d: 552548 393216 unused 0 0 # (Cyl. 492*- 1185*)
e: 552548 945764 unused 0 0 # (Cyl. 1185*- 1877*)
f: 552548 1498312 unused 0 0 # (Cyl. 1877*- 2569)
g: 819200 393216 unused 0 0 # (Cyl. 492*- 1519*)
h: 838444 1212416 unused 0 0 # (Cyl. 1519*- 2569)
#
#
#
#
# mkfdmn /dev/disk/dsk0a bruden_dom
#
#
#
# addvol /dev/disk/dsk0b bruden_dom
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
showfdmn: unable to display volume info; domain not active
#
#
#
#
# mkfset bruden_dom bruce_fset
#
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
showfdmn: unable to display volume info; domain not active
#
#
#
# mkdir /usr/bruce
# mkdir /usr/dennis
#
# mount bruden_dom#bruce_fset /usr/bruce
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 122432 7% on 256 256 /dev/disk/dsk0a
2 262144 261968 0% on 256 256 /dev/disk/dsk0b
---------- ---------- ------
393216 384400 2%
2. Make mount points and mount the filesets.
# mount bruden_dom#dennis_fset /usr/dennis
#
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 122432 7% on 256 256 /dev/disk/dsk0a
2 262144 261968 0% on 256 256 /dev/disk/dsk0b
---------- ---------- ------
393216 384400 2%
3. Use df -t advfs to check on the available space for each fileset.
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025604 294864 78% /usr
usr_domain#var 1426112 75678 294864 21% /var
bruden_dom#bruce_fset 393216 32 384400 1% /usr/bruce
bruden_dom#dennis_fset 393216 32 384400 1% /usr/dennis
#
Using Extent-Based Storage
1. Add another volume to the domain. Check available space.
#
# addvol /dev/disk/dsk2h bruden_dom
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 122432 7% on 256 256 /dev/disk/dsk0a
2 262144 261968 0% on 256 256 /dev/disk/dsk0b
3 1858624 1858496 0% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2242896 0%
2. Create some large files to take up some space in the filesets. How can a fileset be prevented from taking up all of the available storage in the domain?
Fileset quotas can be used to limit the disk space of a fileset.
# cp /vmunix /usr/bruce/big1
#
#
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025604 294864 78% /usr
usr_domain#var 1426112 75680 294864 21% /var
bruden_dom#bruce_fset 2251840 22784 2197392 2% /usr/bruce
bruden_dom#dennis_fset 2251840 22784 2197392 2% /usr/dennis
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 99680 24% on 256 256 /dev/disk/dsk0a
2 262144 239216 9% on 256 256 /dev/disk/dsk0b
3 1858624 1858496 0% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2197392 2%
#
# cp /vmunix /usr/bruce/big2
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 99680 24% on 256 256 /dev/disk/dsk0a
2 262144 239216 9% on 256 256 /dev/disk/dsk0b
3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2174640 3%
#
#
# cp /vmunix /usr/dennis/big2
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 76928 41% on 256 256 /dev/disk/dsk0a
2 262144 239216 9% on 256 256 /dev/disk/dsk0b
3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2151888 4%
#
#
# cp /vmunix /usr/dennis/big3
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 76928 41% on 256 256 /dev/disk/dsk0a
2 262144 216464 17% on 256 256 /dev/disk/dsk0b
3 1858624 1835744 1% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2129136 5%
#
#
# showfsets -q bruden_dom
Block (512) limits File limits
Fileset BF used soft hard grace used soft hard grace
bruce_fset -- 45536 0 0 4 0 0
dennis_fset -- 68288 0 0 5 0 0
#
#
# showfsets -q bruden_dom dennis_fset
Block (512) limits File limits
Fileset BF used soft hard grace used soft hard grace
dennis_fset -- 68288 0 0 5 0 0
#
#
#
# chfsets -B 25000 -b 30000 bruden_dom bruce_fset
bruce_fset
Id : 37f12c39.000263ea.1.8001
Block H Limit: 0 --> 30000
Block S Limit: 0 --> 25000
#
#
#
# showfsets -q bruden_dom
Block (512) limits File limits
Fileset BF used soft hard grace used soft hard grace
bruce_fset -- 45536 50000 60000 4 0 0
dennis_fset -- 68288 0 0 5 0 0
#
#
# showfsets -qk bruden_dom
Block ( 1k) limits File limits
Fileset BF used soft hard grace used soft hard grace
bruce_fset -- 22768 25000 30000 4 0 0
dennis_fset -- 34144 0 0 5 0 0
#
#
#
# cp /vmunix /usr/bruce/big4
#
# su obrien
$
$
$ cp /vmunix /usr/bruce/big5
cp: /usr/bruce/big5: Permission denied
$
$ ls -ld /usr/bruce
drwxr-xr-x 3 root system 8192 Sep 28 17:21 /usr/bruce
$
$ su -
#
#
# chmod ugo+w /usr/bruce
# chmod ugo+w /usr/dennis
#
# su obrien
$
$
$ cp /vmunix /usr/bruce/big5
/usr/bruce: write failed, fileset disk limit reached
cp: /usr/bruce/big5: Disc quota exceeded
$
$
$ df -t /usr/bruce
Filesystem 512-blocks Used Available Capacity Mounted on
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
$
$ showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 76928 41% on 256 256 /dev/disk/dsk0a
2 262144 216464 17% on 256 256 /dev/disk/dsk0b
3 1858624 1812992 2% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2106384 6%
$
$
$ showfsets -q bruden_dom
Block (512) limits File limits
Fileset BF used soft hard grace used soft hard grace
bruce_fset *- 68288 50000 60000 none 6 0 0
dennis_fset -- 68288 0 0 5 0 0
$
$
$
$ cp /vmunix /usr/dennis/big5
$
$ df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025636 294816 78% /usr
usr_domain#var 1426112 75682 294816 21% /var
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
bruden_dom#dennis_fset 2251840 91040 2083632 5% /usr/dennis
$
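The quota figures in the transcript above appear in two units: chfsets reports the limits it sets (25000 and 30000), while showfsets -q reports 512-byte blocks (50000 and 60000) and showfsets -qk reports 1K units. The following Python fragment is only a sketch of that unit conversion; the function names are illustrative and are not part of AdvFS.

```python
# Sketch only: convert between the 1K units seen in the chfsets output
# and the 512-byte blocks shown by showfsets -q in the transcript.
# The helper names are hypothetical, chosen for this illustration.
BYTES_PER_BLOCK = 512
BYTES_PER_KB = 1024

def kb_to_blocks(kb):
    """Number of 512-byte blocks in the given number of 1K units."""
    return kb * BYTES_PER_KB // BYTES_PER_BLOCK

def blocks_to_kb(blocks):
    """Number of whole 1K units in the given number of 512-byte blocks."""
    return blocks * BYTES_PER_BLOCK // BYTES_PER_KB

soft_blocks = kb_to_blocks(25000)   # soft limit set with chfsets -B
hard_blocks = kb_to_blocks(30000)   # hard limit set with chfsets -b
used_kb = blocks_to_kb(45536)       # bruce_fset usage from showfsets -q

print(soft_blocks, hard_blocks, used_kb)
```

The results agree with the soft, hard, and used columns of the showfsets -q and showfsets -qk listings for bruce_fset.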
Cloning
1. Make a clone of one of your filesets. How long did it take to create?
Clones take very little time to create since no data is copied. Some metadata is created to represent the clone.
$
$ clonefset bruden_dom dennis_fset dennis_clone
Permission denied - user must be root to run clonefset.
usage: clonefset domain origSetName cloneSetName
$
$
$
$
#
# clonefset bruden_dom dennis_fset dennis_clone
#
2. Check the contents of the clone. Does it match the contents of the original?
3. Put a new file in the original fileset. Does it appear in the clone?
4. Try to add a new file to the clone. What happened?
Contents of the clone match the original fileset. New files will not appear in the clone since the clone is effectively a snapshot. A clone cannot be written to since the clone is a read-only, pseudo copy of the original fileset.
# mkdir /usr/den_clone
#
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 76928 41% on 256 256 /dev/disk/dsk0a
2 262144 193712 26% on 256 256 /dev/disk/dsk0b
3 1858624 1812864 2% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2083504 7%
#
# showfsets bruden_dom
bruce_fset
Id : 37f12c39.000263ea.1.8001
Files : 6, SLim= 0, HLim= 0
Blocks (512) : 68288, SLim= 50000, HLim= 60000 grc= none
Quota Status : user=off group=off
dennis_fset
Id : 37f12c39.000263ea.2.8001
Clone is : dennis_clone
Files : 6, SLim= 0, HLim= 0
Blocks (512) : 91040, SLim= 0, HLim= 0
Quota Status : user=off group=off
dennis_clone
Id : 37f12c39.000263ea.3.8001
Clone of : dennis_fset
Revision : 1
#
#
# mount bruden_dom#dennis_clone /usr/den_clone
#
#
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025638 294816 78% /usr
usr_domain#var 1426112 75686 294816 21% /var
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
bruden_dom#dennis_fset 2251840 91040 2083504 5% /usr/dennis
bruden_dom#dennis_clone 2251840 91040 2083504 5% /usr/den_clone
#
#
#
# ls -li /usr/dennis
total 45528
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
#
#
# ls -li /usr/den_clone
total 45528
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
#
#
#
# cat /etc/disktab > /usr/dennis/sm1
#
# ls -li /usr/dennis/sm1
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 /usr/dennis/sm1
#
# ls -li /usr/den_clone/sm1
ls: /usr/den_clone/sm1 not found
#
#
# cat /etc/disktab > /usr/den_clone/sm1
sh: /usr/den_clone/sm1: cannot create
#
#
#
# ls -li /usr/dennis
total 45559
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
9 -rwxr-xr-x 1 obrien system 11646960 Sep 28 17:26 big5
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
Using Trashcans
1. Delete a file from one of your filesets. Can you get it back?
2. Create a trashcan. Associate it with your fileset.
3. Delete a file from the fileset. Can you get it back?
You can retrieve deleted files only if you have a trashcan associated with the directory.
# rm /usr/dennis/big5
#
#
#
# mkdir /usr/dennis/den_trash
#
# chmod a+w /usr/dennis/den_trash
#
# mktrashcan /usr/dennis/den_trash /usr/dennis
’/usr/dennis/den_trash’ attached to ’/usr/dennis’
#
#
# rm /usr/dennis/big3
#
# ls -li /usr/dennis
total 22815
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
#
#
# ls -li /usr/dennis/den_trash
total 11376
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
#
# mv /usr/dennis/den_trash/big3 /usr/dennis
#
# ls -li /usr/dennis/den_trash
total 0
# ls -li /usr/dennis
total 34191
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
Striping
1. Create an empty striped file. Use the showfile -x command to view the extents of the empty file.
2. Show the extents of one of your large files.
3. Copy the large file to the empty striped file.
4. Revisit the extents of the striped file. Is there any performance difference in reading the two files?
5. Use # time cat /mnt_pnt/big_file > /dev/null to time the read of each file.
Depending on the configuration of the stripe volumes, there should be a performance improvement when using the striped file.
# touch /usr/dennis/stripe1
#
# stripe -n 2 /usr/dennis/stripe1
#
#
# showfile -x /usr/dennis/stripe1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
c.8001 2 16 0 stripe 2 8 async 100% stripe1
extentMap: 1
pageOff pageCnt volIndex volBlock blockCnt
extentCnt: 0
extentMap: 2
pageOff pageCnt volIndex volBlock blockCnt
extentCnt: 0
#
#
# cd /usr/dennis
#
# ls -li
total 34191
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
12 -rw-r--r-- 1 root system 0 Sep 28 17:39 stripe1
#
# showfile -x big3
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
8.8001 2 16 1422 simple ** ** async 100% big3
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1422 2 75504 22752
extentCnt: 1
# showfile -x big2
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
7.8001 1 16 1422 simple ** ** async 100% big2
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1422 1 57760 22752
extentCnt: 1
#
# showfile -x big1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
6.8001 2 16 1422 simple ** ** async 100% big1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1422 2 52592 22752
extentCnt: 1
#
#
# cp big3 stripe1
#
# showfile -x stripe1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
c.8001 2 16 1422 stripe 2 8 async 100% stripe1
extentMap: 1
pageOff pageCnt volIndex volBlock blockCnt
0 8 3 1093536 11392
16 8
32 8
48 8
64 8
(...) 1392 8
1408 8
extentCnt: 1
extentMap: 2
pageOff pageCnt volIndex volBlock blockCnt
8 8 1 80704 11360
24 8
40 8
56 8
(...) 1400 8
1416 6
extentCnt: 1
#
#
#
# time cat big3 > /dev/null
real 0m1.61s
user 0m0.00s
sys 0m0.36s
# time cat stripe1 > /dev/null
real 0m1.03s
user 0m0.01s
sys 0m0.36s
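The page counts in the showfile -x output above can be reproduced with a little arithmetic. The following Python fragment is a sketch that assumes simple round-robin placement of 8-page segments across the two stripes; the function names are illustrative, and the real allocator may behave differently in other configurations.

```python
# Sketch only: reproduce the stripe1 page and block counts, assuming
# 8K pages, 16 blocks of 512 bytes per page, and round-robin placement
# of SegSz-page chunks across the stripe segments.
PAGE_BYTES = 8192
BLOCKS_PER_PAGE = 16

def pages_for(size_bytes):
    """Pages needed to hold size_bytes (ceiling division)."""
    return -(-size_bytes // PAGE_BYTES)

def stripe_layout(total_pages, segments, seg_pages):
    """Return the number of pages that land on each stripe segment."""
    per_segment = [0] * segments
    page = 0
    chunk = 0
    while page < total_pages:
        count = min(seg_pages, total_pages - page)
        per_segment[chunk % segments] += count
        page += count
        chunk += 1
    return per_segment

pages = pages_for(11646960)                      # size of big3 in bytes
layout = stripe_layout(pages, segments=2, seg_pages=8)
blocks = [n * BLOCKS_PER_PAGE for n in layout]

print(pages, layout, blocks)
```

Under these assumptions the file occupies 1422 pages, split 712 and 710 between the two extent maps, which corresponds to the blockCnt values 11392 and 11360 reported for stripe1.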
2
AdvFS On-Disk Structures
About This Chapter
Introduction
This chapter presents information about the AdvFS on-disk structures.
Objectives
To describe AdvFS on-disk structures, you should be able to describe on-disk structures associated with these items:
• Bitfiles
• Mcells
• Extent maps
• Tags
• Fragments
• Storage bitmap bitfiles
Resources
For more information on topics in this chapter as well as related topics, see the following:
• Advanced File System Administration
• AdvFS Reference Pages
• Header Files
Introducing AdvFS On-Disk Structures
Overview
This section provides a foundation for learning the details of the AdvFS on-disk structures. It covers the following topics:
• Two-level implementation of AdvFS
• Using the .tags directory
• BAS on-disk format: everything is a bitfile
• Bitfiles
• Mcells
• AdvFS file addresses
• Filesets (formerly referred to as bitfile sets)
• Reusing tags
Two-Level Implementation of AdvFS
AdvFS is built using a two-layer strategy separating file access support from file storage support. The two layers are:
• File access system (FAS)
— Higher level of AdvFS
— Transforms bitfiles into normal UNIX files
— Client layer
Think of this as the standard file system components such as directories, support for quotas, mount points, and so forth.
• Bitfile access system (BAS)
— Lowest level of AdvFS providing storage support
— Contains the more complex storage structures supporting and containing the metadata for AdvFS
The following figure depicts the FAS as conceptually connected to the BAS through the .tags directory.
Figure 2-1: Two-Level Implementation of AdvFS
.tags Directory
Consider the .tags directory as a way to access the BAS from the context of the FAS. All files and metadata are accessible through .tags by using the file’s tag number or a special metadata file name.
The tag number is conceptually similar to a UFS inode number. Use the ls -li or showfile commands to discover a file’s tag. Tag numbers are associated with a sequence number that indicates the number of times this tag has been used.
The .tags directory provides access to files within a mounted fileset using tag numbers. The .tags directory also provides access to the lower-level, “special” metadata files through predefined names (M-10 is the BMT, M-6 is the RBMT). The new AdvFS on-disk viewing utilities (nvtagpg, nvbmtpg, nvfragpg, nvlogpg) hide most of the details of accessing metadata files through the .tags directory.
Each AdvFS file system has a .tags directory which allows files to be accessed by tag and sequence number, for example:
• /Advfs_mount_point/.tags/15374
• /Advfs_mount_point/.tags/0x3c0e.8001
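The two path forms listed above appear to name the same tag, since 0x3c0e is 15374 in decimal. The following Python fragment is a sketch of how such paths could be assembled; the helper name is hypothetical, and the formats are inferred only from the two sample paths.

```python
# Sketch only: build .tags path names in the two forms shown above.
# The helper is hypothetical; the formats are inferred from the
# sample paths (decimal tag, or 0x-prefixed hex tag plus hex sequence).
def tags_paths(mount_point, tag, seq):
    """Return (decimal form, hexadecimal tag.sequence form)."""
    decimal_form = f"{mount_point}/.tags/{tag}"
    hex_form = f"{mount_point}/.tags/0x{tag:x}.{seq:x}"
    return decimal_form, hex_form

decimal_form, hex_form = tags_paths("/Advfs_mount_point", 15374, 0x8001)
print(decimal_form)
print(hex_form)
```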
[Figure 2-1 content, On Disk Structures: the File Access Subsystem (FAS) contains the UNIX directory structure. The Bitfile Access Subsystem (BAS) contains the AdvFS on-disk metadata: Bitfile Metadata Table (BMT), Storage Bitmap (SBM), Miscellaneous Bitfile, Root Tag File, Fragment Bitfile, Fileset Tag File, and so on. The .tags directory connects the FAS with the BAS.]
The following example shows an inode number (tag number) being used to access a file through the .tags directory.
Example 2-1: Tag Number can Access Any File
# ls -li
total 31
22894 -rwxr-xr-x 1 root system 31114 Jun 24 15:20 ob_1
#
# tail -3 ob_1
:of#169728:pf#35135:bf#8192:ff#1024:\
:og#99458:pg#149368:bg#8192:fg#1024:\
:oh#0:ph#0:bh#8192:fh#1024:
#
# tail -3 /usr/.tags/22894
:of#169728:pf#35135:bf#8192:ff#1024:\
:og#99458:pg#149368:bg#8192:fg#1024:\
:oh#0:ph#0:bh#8192:fh#1024:
#
BAS On-Disk Format: Everything is a Bitfile
All AdvFS on-disk structures can be accessed as bitfiles. This includes user files and directories as well as the AdvFS metadata structures.
Bitfiles are arrays of 8K disk pages holding user data or metadata. A series of contiguous 8K pages in a bitfile is stored as an extent.
Each bitfile is identified by its tag, which consists of a tag number and sequence number pair. Use the tag number to locate the extents of a file.
The tag2name program in /sbin/advfs can (usually) translate a tag number to a path name. Note that tag2name has a new format for V5, as shown in the following example.
Example 2-2: Tag Number 22894 Being Translated by tag2name
# /sbin/advfs/tag2name /usr/.tags/22894
/usr/bruden/ob_1
#
# echo $PATH
/sbin:/usr/sbin:/usr/bin:/usr/ccs/bin:/usr/bin/X11:/usr/local
#
# PATH=$PATH:/sbin/advfs
#
# tag2name usr_domain -S usr 22894
open_vol: open for volume "/dev/disk/dsk2g" failed: Device busy
#
# tag2name -r usr_domain -S usr 22894     <== Uses raw device (-r)
bruden/ob_1
#
The following figure depicts the AdvFS metadata being used to access the bitfiles by referencing a file system directory.
Figure 2-2: Using AdvFS Metadata to Translate FAS to BAS
[Figure 2-2 content: a file system directory lists file1 (tag 623), file2 (tag 51), and file3 (tag 893). The AdvFS metadata translates the tag for file3 to its location on disk at LBN 80334. AdvFS sees this as a bitfile.]
The following figure shows the logical file as a series of 8K blocks and being represented at the lower level by one or more mcell data structures found in the BAS.
Figure 2-3: BAS On-Disk Format
Bitfiles
Bitfiles are arrays of 8K pages and are stored as extents:
• Groups of on-disk contiguous 8K pages
• Managed by extent maps
Bitfiles are identified by a tag:
• Tag.sequence such as 4714.8001
• Tag number is similar to an inode number
• Sequence number functions as a generation number
All sectors are free or in a bitfile and are managed by mcell chains.
Use the showfile command to find the tag and sequence number.
Mcells
Several metadata bitfiles (RBMT, BMT) have an internal page organization consisting of a page header and a series of mcell data structures. Each mcell can contain a series of variable-length records describing various bitfile attributes and characteristics.
AdvFS locates extents by finding the file’s primary metadata cell (mcell) and stepping through the mcells to find the extent information.
[Figure 2-3 content: a logical file (owner, group, size, mod bits, and so on) is stored on disk as 8K pages grouped into extent 1 and extent 2. The primary mcell (292 bytes) contains variable-sized records such as POSIX attributes and extent map records. Optional additional mcells can contain more extent map records if needed.]
The tag number is like an inode number, but an mcell functions like an inode.
• It holds permissions, size, extent information, link count and so forth.
• Each mcell is 292 bytes; 28 mcells fit on an 8K page (plus a 16-byte header).
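The figure of 28 mcells per page follows directly from the sizes given above. The following Python fragment is a sketch of that arithmetic; the offset helper assumes the 16-byte header sits at the start of the page with the mcells packed immediately after it, which the text does not state explicitly.

```python
# Sketch only: check the mcell arithmetic from the sizes quoted above.
PAGE_BYTES = 8192
PAGE_HEADER_BYTES = 16
MCELL_BYTES = 292

mcells_per_page = (PAGE_BYTES - PAGE_HEADER_BYTES) // MCELL_BYTES
leftover_bytes = (PAGE_BYTES - PAGE_HEADER_BYTES) % MCELL_BYTES

def mcell_offset(cell):
    """Byte offset of a cell within its page.

    Assumption for illustration: the header comes first and the
    mcells are packed back to back after it.
    """
    if not 0 <= cell < mcells_per_page:
        raise ValueError("cell number out of range")
    return PAGE_HEADER_BYTES + cell * MCELL_BYTES

print(mcells_per_page, leftover_bytes, mcell_offset(15))
```

The header and 28 mcells fill the page exactly (16 + 28 × 292 = 8192), so no bytes are left over.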
The following figure shows how a file’s tag number can locate the file’s primary mcell in the BMT. The primary mcell provides access to the actual data blocks of the file.
Figure 2-4: Tag Number to BMT mcell to Logical Blocks
AdvFS File Addresses
The lowest level of AdvFS is the bitfile access system (BAS). Here every file is a bitfile, a collection of 8192-byte pages. The higher level of AdvFS, the file access system (FAS), enables the bitfiles to appear as normal UNIX files. Both ls (with the -i option) and showfile will print the tag number.
AdvFS files have two means of identification:
• Tag
— Similar to the UFS inode number
— Can be discovered with the ls -i command
• Primary mcell ID
— Component of the lower BAS layer
— Start of a linked list of one or more mcells
— Well hidden from users and system administration
[Figure 2-4 content: a massaged tag number indexes into the Bitfile Metadata Table (BMT), which holds many mcells, to find the primary mcell describing and locating the file. The primary mcell will be chained to further mcells.]
Bitfile-Set
Bitfile-sets have the following characteristics:
• FAS fileset represents a BAS bitfile-set
• Identified by numbers
• Bitfiles are known by:
— Domain ID
— Fileset ID
— Tag and sequence number
Domain IDs can be found using the showfdmn command. Fileset IDs can be found using the showfsets command.
Reusing Tags
Each time a file is created, a tag is allocated to represent that file. When the file is deleted, the tag is recycled. If the tag is selected for reuse, AdvFS can distinguish between the incarnations of the tag by referencing the tag’s sequence number. The sequence number is 16 bits, with the leftmost bit used as the “in use” indicator. Therefore sequence number 8003 means the tag is in use (0x8 = 1000 binary), and it is the third use of the tag. Of the remaining 15 bits, only 12 are used for sequencing. Therefore a tag can be reused 4096 times before it becomes a “dead” tag.
Tag numbers can be reused:
• With file creation and deletion.
• Like inode numbers.
A sequence number identifies various versions of the tag:
• Tags have initial sequence number of 8001 hexadecimal or 32769 decimal.
• Sequence number is incremented when tag number is reused.
• When sequence number overflows, tag is discarded.
• Leftmost bit indicates tag in use; the remaining 15 bits are available for sequencing, of which only 12 are used.
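The sequence number layout described above can be decoded with two bit masks. The following Python fragment is a sketch: the text says the leftmost of the 16 bits is the in-use flag and that 12 bits are used for sequencing, and this example assumes those 12 bits are the low-order ones. The function name is illustrative.

```python
# Sketch only: decode a 16-bit tag sequence number.
# Assumption for illustration: the 12 sequencing bits are the
# low-order bits of the value.
IN_USE_BIT = 0x8000
SEQUENCE_MASK = 0x0FFF

def decode_sequence(seq):
    """Return (in_use, use_count) for a 16-bit sequence number."""
    in_use = bool(seq & IN_USE_BIT)
    use_count = seq & SEQUENCE_MASK
    return in_use, use_count

print(decode_sequence(0x8001))   # first use of a tag, in use
print(decode_sequence(0x8003))   # third use of a tag, in use
print(0x8001)                    # the decimal form quoted in the text
print(SEQUENCE_MASK + 1)         # distinct values available in 12 bits
```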
Describing BAS On-Disk Metadata Bitfiles
Overview
To troubleshoot on-disk problems, an understanding of the metadata bitfiles is mandatory. This section introduces the metadata bitfiles and covers the following topics:
• Domain and volume structures
• Bitfile metadata table
• Per domain bitfiles
• Per volume bitfiles
• Per fileset bitfiles
• Reserved bitfile special names
• Metadata bitfile tags
• .tags directory entries for metadata bitfiles
Domain and Volume Structures
Each volume in an AdvFS domain consists of the following structures:
• Reserved bitfile metadata table
• Bitfile metadata table
• Storage bitmap
• Miscellaneous bitfile
In addition to the per-volume structures, each domain also has the following structures. For these structures there is only one per domain and they can reside on any volume in the domain:
• Transaction log
• Root tag file
The following figure illustrates the BAS on-disk metadata bitfiles.
Figure 2-5: BAS On-Disk Metadata Bitfiles
Per Domain Bitfiles
Each domain is supported by the following bitfiles:
• The on-disk log:
— Contains the transaction log.
— Is usually 4MB in size.
• The root tag file:
— Lists filesets.
— Is one page in size.
The following example uses the nvtagpg command to display the root tag file of the usr_domain. The -r option specifies that the raw device be used for access instead of the block device, eliminating a device busy error.
Example 2-3: Displaying the Root Tag File
# nvtagpg -r usr_domain
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 96 root TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 3 numDeadTMaps 0 nextFreePage 0 nextFreeMap 5

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 1 usr
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 13 var
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 2 2 4 ob_fset
#
[Figure 2-5 content: two volumes are shown. Per volume: Reserved Bitfile Metadata Table, Bitfile Metadata Table, Storage Bitmap, and MISC Bit File. Per domain: Root Tag File and On-Disk Log. Per fileset: Tag File (fileset A) and Fragment File (fileset A).]
Per Volume Bitfiles
Each volume is supported by the following bitfiles:
• Reserved bitfile metadata table (RBMT)
— Contains mcells for reserved bitfiles
— Last mcell in each RBMT page links to the next page
— Eliminates BMT fragmentation problems
The following example uses the nvbmtpg command to display summary information about the RBMT.
Example 2-4: Displaying RBMT Summary Information
# nvbmtpg -r -R usr_domain
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "usr_domain" VDI 2 (/dev/rdisk/dsk4b) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
• Bitfile metadata table (BMT)
— Contains all the mcells for nonreserved bitfiles
— Grows as new files are created
The following example displays summary information about the BMT for volume one in the usr_domain.
Example 2-5: Displaying BMT for Volume 1 of usr_domain
# nvbmtpg -r usr_domain 1
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 48 BMT page 0
--------------------------------------------------------------------------
There are 1025 pages in the BMT on this volume.
The BMT uses 2 extents (out of 33) in 2 mcells.
• Storage bitmap (SBM)
Contains 1 bit for every 8K bytes (1 bit per 1K in Tru64 UNIX V4.0)
• Miscellaneous bitfile
Contains bootblocks and is four pages in size
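The storage bitmap granularity quoted above determines how many bits are needed to map a volume. The following Python fragment is a rough sketch of that calculation for a volume of 262144 512-byte sectors; it counts raw bits only and ignores any headers or padding the real SBM bitfile may carry. The function name is illustrative.

```python
# Sketch only: raw number of storage bitmap bits needed for a volume,
# ignoring any page headers or padding in the real SBM bitfile.
SECTOR_BYTES = 512

def sbm_bits(volume_sectors, bytes_per_bit):
    """Bits needed when each bit maps bytes_per_bit of storage."""
    return volume_sectors * SECTOR_BYTES // bytes_per_bit

sectors = 262144                         # example volume size in sectors
bits_per_8k = sbm_bits(sectors, 8192)    # 1 bit for every 8K bytes
bits_per_1k = sbm_bits(sectors, 1024)    # 1 bit per 1K (V4.0 granularity)

print(bits_per_8k, bits_per_8k // 8)     # bits, then bytes of bitmap
print(bits_per_1k, bits_per_1k // 8)
```

Moving from 1K to 8K granularity reduces the raw bitmap for the same volume by a factor of eight.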
Per Fileset Bitfiles
A bitfile-set is BAS nomenclature for FAS fileset. Most references to the term bitfile-set are being replaced with the term fileset to avoid unnecessary confusion. Each fileset is supported by the following bitfiles:
• Tag file (not .tags):
— Translates tag number into location of primary mcell (within BMT) for the appropriate file
— Formerly called the tag directory file
• Tags can be reused as files are deleted:
— Limited to size of associated sequence number
— 8001 is a typical sequence number showing that this tag is in use (left bit is set) and it is in use for the first time (001)
— Limits tag reuse to ~4k times before tag is dead
• Tag file consists of 8K pages with 1022 tagmap entries (8 bytes each) preceded by a 16-byte header:
— Tagmap entry contains the sequence number, volume index, and primary mcell ID (BMT page number and cell number within page)
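With 1022 tagmap entries per page, a tag number splits into a tag file page and an entry index by integer division. The following Python fragment is a sketch of that lookup; the function name is illustrative. Its result for tag 22896 agrees with the nvtagpg output in Example 2-6, which reports page 22 and tMapA[412].

```python
# Sketch only: locate a tag's entry in the fileset tag file using the
# page geometry given above (16-byte header, 1022 entries of 8 bytes).
PAGE_BYTES = 8192
TAG_PAGE_HEADER_BYTES = 16
TAGMAP_ENTRY_BYTES = 8
TAGMAPS_PER_PAGE = 1022

def tagmap_location(tag):
    """Return (tag file page, entry index within that page)."""
    return divmod(tag, TAGMAPS_PER_PAGE)

page, index = tagmap_location(22896)     # tag of big1 in Example 2-6
page_bytes_used = TAG_PAGE_HEADER_BYTES + TAGMAPS_PER_PAGE * TAGMAP_ENTRY_BYTES

print(page, index, page_bytes_used)
```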
The following figure shows information from the fileset tag file locating the primary mcell for a file. There may be a chain of mcells describing the file.
Figure 2-6: Fileset Tag Directory Locating Primary Mcell
The following example uses the tag number of a file to locate the tag directory file entry for the file. The volume, page, and cell information found in the tag directory file is used to access the primary mcell of the file.
Example 2-6: Finding Primary Mcell through Tag Directory
# ls -li big1
22896 -rwxr-xr-x 1 root system 13729520 Jun 24 16:53 big1
#
#
# nvtagpg -r usr_domain -T 1 -t 22896
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 2055136 "usr" FRAG page 22
--------------------------------------------------------------------------
currPage 22
numAllocTMaps 1022 numDeadTMaps 0 nextFreePage 0 nextFreeMap 0

tMapA[412] tag 22896 seqNo 1 primary mcell (vol,page,cell) 1 951 15
#
#
#
#
[Figure 2-6 content: the entry for tag 893 in the fileset tag file (M1) records sequence # 3, volume # 1, BMT page 811, mcell 7. That entry points to page 811, mcell 7 of the bitfile metadata table, which records the extent at 80334, 50 pages.]
#
# nvbmtpg -r usr_domain 1 951 15
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 23552 BMT page 951
--------------------------------------------------------------------------
CELL 15 next mcell volume page cell 2 2 9 bfSetTag,tag 1,22896

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 1 951 16
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 573376 (0x8bfc0)
bsXA[ 1] bsPage 21 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100755 (S_IFREG) st_uid 0 st_gid 0 st_size 13729520
st_nlink 1 dir_tag 22893 st_mtime Thu Jun 24 16:53:06 1999
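The bsXA entries in the BSR_XTNTS record above can be read as a list of (bsPage, vdBlk) pairs. The following Python fragment is a sketch of one way to turn such a list into extents. It assumes that each extent runs from its own bsPage up to the bsPage of the next entry, and that a vdBlk of -1 marks an entry with no storage (here, the terminator). That reading is an interpretation of the output, not a statement of the on-disk format, and the function name is illustrative.

```python
# Sketch only: interpret (bsPage, vdBlk) pairs as extents.
# Assumptions for illustration: an extent ends where the next entry
# begins, and vdBlk == -1 means the entry maps no storage.
BLOCKS_PER_PAGE = 16   # 8K page / 512-byte blocks

def extents_from_bsxa(entries):
    """Return a list of (start_page, page_count, start_block)."""
    extents = []
    for (page, blk), (next_page, _) in zip(entries, entries[1:]):
        if blk == -1:
            continue               # no storage behind this entry
        extents.append((page, next_page - page, blk))
    return extents

# Values taken from RECORD 1 of CELL 15 above.
bsxa = [(0, 573376), (21, -1)]
extents = extents_from_bsxa(bsxa)

print(extents)
print(extents[0][1] * BLOCKS_PER_PAGE)   # blocks covered by the extent
print(hex(573376))                        # matches the (0x8bfc0) shown
```

Under this reading, the primary mcell maps the first 21 pages of big1, and the remaining extents are described in the chained mcell (1 951 16).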
A fragment bitfile:
• Contains small files including the last parts of small files.
• Varies in size with number of small files.
• Has tag number 1.
Reserved Bitfile Special Names
The .tags directory can be used with many special file names which provide user-level access to the BAS bitfiles. These are special names in that they are not visible to standard commands. They have meaning to AdvFS aware software. Any command that uses the VFS I/O component will be AdvFS aware.
The following bitfiles are all accessible under the .tags directory.
Name              Function                                       One Per
M-6, M-12, ...    Reserved Bitfile Metadata Table                Volume
M-7, M-13, ...    Storage Bitmap                                 Volume
M-8, M-14, ...    Root Tag File                                  Domain
M-9, M-15, ...    Transaction Log                                Domain
M-10, M-16, ...   Bitfile Metadata Table                         Volume
M-11, M-17, ...   Miscellaneous Bitfile                          Volume
M1                Fileset tag file (not .tags) for fileset #1    Fileset
M2                Fileset tag file (not .tags) for fileset #2    Fileset
M3                Fileset tag file (not .tags) for fileset #3    Fileset
(...)
Mn                Fileset tag file (not .tags) for fileset #n    Fileset
1                 Fragment Bitfile                               Fileset
2                 Fileset’s Root Directory                       Fileset
3                 .tags Directory                                Fileset
4                 User Quota File                                Fileset
5                 Group Quota File                               Fileset
6                 User File with Tag # 6                         Fileset
7                 User File with Tag # 7                         Fileset
(...)
n                 User File with Tag # n                         Fileset
* Next instance of M-n file is M-(n+6). Not true for Mn files!
Metadata Bitfile Tags
Nonreserved bitfiles (user files) are assigned tags from their tag file. Reserved bitfiles (metadata bitfiles) do not have tags assigned from a tag file because they exist before a tag file exists (such as the root tag file) and because their mcell locations must always be in a known place. Therefore, reserved tags are calculated as follows:
tag = - (reserved-bitfile-primary-mcell-number + (volume-index * 6))
Since tags are used to locate the bitfile's primary mcell, the BAS translates a reserved tag to the primary mcell by reversing the above calculation (translating the tag to a volume number and an mcell address):
volume-index = -tag / 6
reserved-bitfile-primary-mcell-number = -tag % 6
So, for RBMT tags -6 and -12:
-6: 6/6 == volume-index:1
-6: 6%6 == mcell:0
-12: 12/6 == volume-index:2
-12: 12%6 == mcell:0
Tags for reserved bitfiles of virtual disk i are:
tag = - (magic-number-shown-in-table + (vol_index * 6))
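The reserved-tag arithmetic can be sketched in C as follows. This is only an illustration of the formula, not AdvFS source; the function and constant names are hypothetical, and volume indexes are assumed to start at 1, as in the tables in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Six reserved bitfiles per volume: RBMT, SBM, root tag file, log,
 * BMT, and miscellaneous bitfile (primary mcells 0 through 5). */
enum { RESERVED_PER_VOLUME = 6 };

/* tag = -(mcell + volume_index * 6) */
int32_t reserved_tag(int32_t volume_index, int32_t mcell)
{
    assert(volume_index >= 1);
    assert(mcell >= 0 && mcell < RESERVED_PER_VOLUME);
    return -(mcell + volume_index * RESERVED_PER_VOLUME);
}

/* Reverse mapping: negate the tag first, then divide. */
int32_t reserved_tag_volume(int32_t tag)
{
    assert(tag <= -RESERVED_PER_VOLUME);
    return -tag / RESERVED_PER_VOLUME;
}

/* Reverse mapping: negate the tag first, then take the remainder. */
int32_t reserved_tag_mcell(int32_t tag)
{
    assert(tag <= -RESERVED_PER_VOLUME);
    return -tag % RESERVED_PER_VOLUME;
}
```

For instance, reserved_tag(2, 3) yields -15, the log of disk 2, and reserved_tag_volume(-15) and reserved_tag_mcell(-15) return 2 and 3.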
Metadata bitfile tags can be printed in an unusual form: as 32-bit hexadecimal values, which effectively translate to negative numbers. For example:
fffffffa.0 RBMT for disk 1 (tag -6)
fffffff3.0 SBM for disk 2 (tag -13)
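A minimal sketch of reading such a value back as a signed tag, assuming the eight hexadecimal digits are simply the 32-bit two's complement form of the tag. The function name is hypothetical, and anything after the hexadecimal digits (such as the ".0" suffix) is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Convert a hexadecimal tag string such as "fffffffa" to a signed tag.
 * strtoul() stops at the first non-hex character, so "fffffffa.0"
 * is accepted as well. */
int32_t tag_from_hex(const char *hex)
{
    assert(hex != NULL);
    uint32_t raw = (uint32_t)strtoul(hex, NULL, 16);

    if (raw <= (uint32_t)INT32_MAX)
        return (int32_t)raw;
    /* Portable two's complement reinterpretation of the high half. */
    return -(int32_t)(UINT32_MAX - raw) - 1;
}
```

With this sketch, "fffffffa" converts to -6 and "fffffff3" converts to -13.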
.tags Directory Entries for Metadata Bitfiles
The special file names in the .tags directory can take several forms. M-6 is more readable than -6, which would also work. Start with an M:
• Use the negative number for virtual disk-specific files
• Use the fileset ID for fileset tagfiles
For example, if /usr is an AdvFS fileset, use the values in the table.
Bitfile Metadata Table
The bitfile metadata table (BMT) holds the support data for user files and directories. It contains location information, permissions and other stats, extent information, fragment location, and other descriptive data. The metadata describing the BMT itself is contained in the RBMT. This avoids the BMT fragmentation problems seen in V4. It also eliminates the need for the -x and -p options on several commands.
Table 2-1: Metadata Bitfile Tags
Reserved File         Formula               Disk 1    Disk 2
RBMT                  - (0 + (vol * 6))     -6        -12
SBM                   - (1 + (vol * 6))     -7        -13
Root tag directory    - (2 + (vol * 6))     -8        -14
Log                   - (3 + (vol * 6))     -9        -15
BMT                   - (4 + (vol * 6))     -10       -16
Misc Bitfile          - (5 + (vol * 6))     -11       -17
Table 2-2: .tags for Metadata Bitfiles
File               Description
/usr/.tags/1       Fragment bitfile
/usr/.tags/M-6     RBMT of disk 1
/usr/.tags/-6      RBMT of disk 1 also
/usr/.tags/M-15    Log of disk 2
/usr/.tags/M2      Tag file for second fileset
The BMT is represented by a file found under the .tags directory; its special file name is M-10. Like any other AdvFS file it consists of a series of 8K pages, but its pages contain mcells (292 bytes each) and header information.
Each mcell contains one or more variable-length records describing various file attributes, extents, permissions, fragment information, and so forth.
The bitfile metadata table can grow just as any file can grow; it just adds another extent. It starts with slightly more than 1M (this can be tailored).
The BMT is created when the mkfdmn command is issued. There is one BMT for each volume in the domain.
The BMT stores bitfile metadata, including:
• Bitfile attributes
• Bitfile extent maps
• Bitfile set attributes
• FAS file attributes including the POSIX file stats
The BMT is an array of 8KB pages where each page consists of a header and an array of fixed-size metadata cells (mcells), where each mcell contains one or more variable-length records. The records are typed (for example, bitfile attributes, or extent map).
BAS record types are defined in src/kernel/msfs/msfs/bs_ods.h and ms_public.h.
The BMT contains all mcells for all files other than the reserved bitfiles:
• User files
• User directories
The mcells for the reserved bitfiles are in the RBMT.
RBMT and BMT:
• First mcell describes itself.
• Grows using extents as more mcells are needed.
• RBMT reserves the last mcell on each page to chain to other pages of mcells.
The following figure shows the on-disk layout for most of the reserved bitfiles.
Figure 2-7: Reserved Bitfiles On Disk Layout
Bitfile                                  Pages                   Sectors
Miscellaneous Bitfile (M-11)             Pages 0, 1              0-31
RBMT (M-6)                               Page 0                  32-47
BMT (M-10)                               Page 0                  48-63
Miscellaneous Bitfile (continued)        Pages 2, 3              64-95
Root Tag Directory (M-8)                 Page 1                  96-111
Storage Bitmap (M-7)                     1 bit per 8k cluster    112-?
Transaction Log (M-9)                    512 Pages               ?
Fileset Tag Directory File (M1)          8 Pages                 ?
Mount Point Directory for Fileset (2)    1 Page                  ?
.tags Directory (3)                      1 Page                  ?
Quota.user (4)                           1 Page                  ?
Quota.Group (5)                          1 Page                  ?
Mcell Records
These characteristics describe mcells and the records within them.
• Mcells are the AdvFS equivalent of inodes; 28 fixed-size (292-byte) mcells are packed into each 8K page
• One or more linked mcells describe a bitfile
• The first mcell in the list is the primary mcell
• Each mcell contains variably sized records describing attributes of the bitfile
Record types contained in the BMT and the RBMT include:
• Extent maps (of various kinds)
• Bitfile attributes (clone, original, and so forth)
• Domain attributes
• Virtual Disk attributes (disk ID, disk index)
• Fragment attributes
• POSIX file stats (permissions, size, link count)
• Symbolic link targets
Mcell Page Structure
This figure illustrates the structure of a page full of mcells.
Figure 2-8: Mcell Page Structure
[Figure: a page consists of a page header followed by 28 mcells; each mcell consists of an mcell header followed by variable-sized records.]
RBMT Page 0
RBMT page 0 starts at sector 32 (LBN 32) and contains these primary mcells:
• Mcell 0 reserved bitfile metadata table (RBMT)
• Mcell 1 storage bitmap (SBM)
• Mcell 2 root tag file (optional, one per domain)
• Mcell 3 log (optional, one per domain)
• Mcell 4 bitfile metadata table (BMT)
• Mcell 5 miscellaneous bitfile
RBMT also contains all secondary mcells (extent maps) for the BMT.
BMT Page 0
BMT page 0 starts at sector 48 and contains mcells for nonreserved bitfiles (that is, user files and directories). It also includes the head of the BMT page free list, which is maintained in mcell 0, the first mcell on the page. Any BMT page that contains at least one free mcell is on this free list; note that BMT page 0 itself is not included in the list. All other BMT pages are found via the RBMT.
BMT Page Format
Each 8192-byte page consists of a 16-byte header followed by twenty-eight 292-byte mcells.
The BMT header consists of:
• Pointer to next free mcell on page
• Pointer to next page with free mcells
• Number of free mcells on the page
• Page number (within BMT)
• AdvFS version (now 4)
The following example shows an excerpt from bs_ods.h.
Example 2-7: BMT Page Structure
typedef struct bsMPg {
    bfMCIdT nextfreeMCId;           /* Next free MCId on the page */
    uint32T nextFreePg;             /* Next page in the mcell free list */
    uint32T freeMcellCnt;           /* Number of free mcells on this pg */
    uint32T pageId : 27;            /* Page number */
    uint32T megaVersion: 5;         /* Overall structure version */
    struct bsMC bsMCA[BSPG_CELLS];  /* Array of Bs Cells */
} bsMPgT;
Mcell Addresses
Mcells can be located in several ways. The mcell is most often found through the tag file; the BMT page number and mcell number (within the page) are all that is needed. Mcells are addressed by a 32-bit mcell ID, bfMCIdT.
• 27 bits give the mcell’s BMT page number.
• 5 bits give the mcell’s position within its page.
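The 27/5 split can be illustrated with plain shifts and masks. This is a sketch only: the helper names are hypothetical, and placing the cell number in the low-order 5 bits is an assumption, since the text gives only the field widths.

```c
#include <assert.h>
#include <stdint.h>

enum {
    MCELL_CELL_BITS = 5,    /* position within the page */
    MCELL_PAGE_BITS = 27,   /* BMT page number */
    MCELLS_PER_PAGE = 28    /* only 28 of the 32 possible cell values are used */
};

/* Pack a (page, cell) pair into a 32-bit ID; cell in the low bits (assumed). */
uint32_t mcell_id_pack(uint32_t page, uint32_t cell)
{
    assert(page < (1u << MCELL_PAGE_BITS));
    assert(cell < MCELLS_PER_PAGE);
    return (page << MCELL_CELL_BITS) | cell;
}

uint32_t mcell_id_page(uint32_t id)
{
    return id >> MCELL_CELL_BITS;
}

uint32_t mcell_id_cell(uint32_t id)
{
    return id & ((1u << MCELL_CELL_BITS) - 1u);
}
```

Packing page 951 and cell 15 and then unpacking the result returns the same pair. Five bits allow 32 cell values, which is enough for the 28 mcells on a page.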
Every bitfile has a primary mcell.
• The tag file holds the map.
• Tag numbers are mapped to primary mcell locations.
Use the nvtagpg command to find out how this is done.
Reserved Mcell Addresses
Within RBMT page 0, the slot numbers listed are used to calculate the tag numbers for the reserved bitfiles:
0 RBMT itself
1 SBM (storage bitmap)
2 Root tag file
3 Transaction log
4 BMT
5 Miscellaneous bitfile
Mcell Format
Each mcell begins with a 24-byte header:
• 32-bit ID of the next mcell in the chain and virtual disk containing next mcell
• Position of this mcell within the chain
• Tag number of the bitfile
• Tag number of the fileset
Each mcell has 268 bytes remaining for mcell records.
The following bsMC structure is found in msfs/msfs/bs_ods.h:
typedef struct bsMC {
    bfMCIdT nextMCId;        /* Link to next mcell */
    uint16T nextVdIndex;     /* vd index of next mcell */
    uint16T linkSegment;     /* segment in link, starts at zero */
    bfTagT tag;              /* Tag this mcell is assigned to */
    bfTagT bfSetTag;         /* tag of this bitfile's bf set dir */
    char bsMR0[BSC_R_SZ];    /* Records */
} bsMCT;
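The sizes quoted in this section (16-byte page header, 24-byte mcell header, 268 bytes of records, 28 mcells, 8192-byte page) can be cross-checked with stand-in structures. The sketch below is not the bs_ods.h source. The type names are hypothetical, the bit-field order inside the mcell ID and the 8-byte tag (32-bit number plus 32-bit sequence) are assumptions, and the checks rely on how common compilers lay out bit-fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MCELL_RECORD_BYTES = 268, MCELLS_PER_PAGE = 28 };

typedef struct { uint32_t cell : 5; uint32_t page : 27; } mcell_id_t; /* order assumed */
typedef struct { uint32_t num; uint32_t seq; } tag_t;                 /* 8 bytes assumed */

typedef struct {
    mcell_id_t next_mcell;              /* link to next mcell */
    uint16_t   next_vd_index;           /* volume holding the next mcell */
    uint16_t   link_segment;            /* position within the chain */
    tag_t      tag;                     /* bitfile tag */
    tag_t      set_tag;                 /* fileset tag */
    char       records[MCELL_RECORD_BYTES];
} mcell_t;

typedef struct {
    mcell_id_t next_free_mcell;
    uint32_t   next_free_page;
    uint32_t   free_mcell_count;
    uint32_t   page_id : 27;
    uint32_t   mega_version : 5;
    mcell_t    mcells[MCELLS_PER_PAGE];
} bmt_page_t;

/* 4 + 2 + 2 + 8 + 8 = 24-byte header, then 268 bytes of records. */
static_assert(offsetof(mcell_t, records) == 24, "mcell header is 24 bytes");
static_assert(sizeof(mcell_t) == 292, "mcell is 292 bytes");
/* 16-byte page header + 28 * 292 = 8192. */
static_assert(offsetof(bmt_page_t, mcells) == 16, "page header is 16 bytes");
static_assert(sizeof(bmt_page_t) == 8192, "BMT page is 8K");
```

If the block compiles, the arithmetic in the text is self-consistent under these assumptions: 24 + 268 = 292 and 16 + (28 * 292) = 8192.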
Mcell Records
Mcell contents are arranged in variable-length records. They vary in size and type, and each begins with a 4-byte header:
• 2-byte count of record size
• 1-byte type field (there are now about 20 record types)
• 1-byte version number for different versions of a type
A null record, whose size is 4 and type is 0, is the end-of-records indicator.
The following bsMR structure is found in msfs/msfs/bs_ods.h:
typedef struct bsMR {
    uint32T bCnt : 16;      /* Count of bytes in record */
    uint32T type : 8;       /* Type of structure contained by record */
    uint32T version : 8;    /* Version of the record's type */
} bsMRT;
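Walking the records of one mcell can be sketched as a loop that reads each 4-byte header and advances by its byte count until it meets the null record. The function name is hypothetical. The sketch assumes a little-endian header with bCnt in the low 16 bits, assumes bCnt includes the header (as the size-4 null record suggests), and ignores any alignment padding between records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RECORD_HEADER_BYTES = 4 };

/* Count the records in an mcell's record area.
 * Returns the number of records before the null record (bCnt 4, type 0),
 * or -1 if a record length is impossible. */
int count_mcell_records(const uint8_t *area, size_t area_len)
{
    size_t off = 0;
    int count = 0;

    assert(area != NULL);
    while (off + RECORD_HEADER_BYTES <= area_len) {
        /* Assumed layout: bytes 0-1 bCnt (little-endian), byte 2 type. */
        uint16_t bcnt = (uint16_t)(area[off] | (area[off + 1] << 8));
        uint8_t type = area[off + 2];

        if (bcnt == RECORD_HEADER_BYTES && type == 0)
            return count;               /* end-of-records indicator */
        if (bcnt < RECORD_HEADER_BYTES || bcnt > area_len - off)
            return -1;                  /* length cannot be right */
        count++;
        off += bcnt;
    }
    /* Area exhausted without a null record; treated here as the end. */
    return count;
}
```

For the primary mcells shown in the nvbmtpg examples, a walk of this kind would visit the records with bCnt 92, 80, and 92 in turn before reaching the terminator.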
Utilities for Viewing Records
The table shows the utilities for viewing records.
These utilities are found in /sbin/advfs.
Utility     Function
nvbmtpg     Displays all records of a BMT (or RBMT) page
nvfragpg    Displays the pages of an AdvFS fragment file
nvlogpg     Displays the pages of the log file
nvtagpg     Displays the pages of a tag file
vsbmpg      Displays pages from the SBM
vfilepg     Displays the pages of any AdvFS file
savemeta    Takes a snapshot of all of a domain's metadata
Using Extent Maps
Overview
Extent maps are stored in mcells linked to a bitfile's primary mcell. Since mcells are of limited size, an extent map may span several mcells (these pieces of the extent map in different mcells are called subextent maps).
When a bitfile is created, AdvFS allocates a primary extent map record in the primary mcell. When an extra extents record is filled with extent map information, the extent map can be extended indefinitely with additional extra extent records.
• Extent maps for nonreserved files
• Extent maps for reserved files
• Encoding of extents
Extent Maps for Nonreserved Files
These characteristics are common to extent maps for nonreserved files:
• Primary extent map record
— Within the primary mcell
— Allocated when file gets a page
— If full, points to an extra extent map
• Extra extent map record is allocated as the file grows
For striped files, extent map records use a shadow extent map and may point to more than one disk.
Extent Maps for Reserved Files
Extent maps for reserved files have these characteristics:
• Primary extent map (if full, points to extra extent map)
• Extra extent map (usually only needed for the BMT itself)
All records for reserved files are in the RBMT.
Encoding of Extents
These characteristics describe extents:
• Extents information is captured in two fields
— Page number within bitfile
— Block number within virtual disks
• Size of one extent is inferred from the next
— Compute difference between the page numbers
— Place -1 in block number to indicate a hole (-2 for clone hole)
• Try out the following:
— Page = 0, block = 8000
— Page = 100, block = -1
— Page = 200, block = 9600
— Page = 300, block = -1
This format can be confusing. You will often see that an extent count displayed by the showfile -x command is one less than seen when visiting the file's mcells. The above sequence indicates a file with three extents:
• First 100 pages start at LBN 8000,
• Next 100 pages are empty (a hole in the file)
• Final 100 pages start at LBN 9600.
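The decoding rule can be sketched in C using the four entries above. The type and function names are hypothetical; the 16 blocks per page figure assumes 512-byte disk blocks and 8K pages, which agrees with the sample values (100 pages separate LBN 8000 from LBN 9600).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two fields of an extent entry. */
typedef struct {
    uint32_t bs_page;   /* first bitfile page covered by this entry */
    uint32_t vd_blk;    /* starting disk block, or a hole marker */
} extent_entry;

#define EXTENT_HOLE       ((uint32_t)-1)   /* -1: hole in the file */
#define EXTENT_CLONE_HOLE ((uint32_t)-2)   /* -2: clone hole */
enum { BLOCKS_PER_PAGE = 16 };             /* 8K page / 512-byte block */

static int is_hole(uint32_t vd_blk)
{
    return vd_blk == EXTENT_HOLE || vd_blk == EXTENT_CLONE_HOLE;
}

/* Pages backed by storage. Each entry's length is the distance to the
 * next entry's page number, so the last entry only ends the map. */
uint32_t extent_mapped_pages(const extent_entry *map, size_t entries)
{
    uint32_t total = 0;

    assert(map != NULL || entries == 0);
    for (size_t i = 0; i + 1 < entries; i++)
        if (!is_hole(map[i].vd_blk))
            total += map[i + 1].bs_page - map[i].bs_page;
    return total;
}

/* Disk block that holds a page, or EXTENT_HOLE if the page is unmapped. */
uint32_t extent_page_to_block(const extent_entry *map, size_t entries,
                              uint32_t page)
{
    assert(map != NULL || entries == 0);
    for (size_t i = 0; i + 1 < entries; i++) {
        if (page >= map[i].bs_page && page < map[i + 1].bs_page) {
            if (is_hole(map[i].vd_blk))
                return EXTENT_HOLE;
            return map[i].vd_blk + (page - map[i].bs_page) * BLOCKS_PER_PAGE;
        }
    }
    return EXTENT_HOLE;
}
```

For the sample map, 200 of the 300 pages are backed by storage, and page 250 resolves to block 9600 + (50 * 16) = 10400. The fourth entry contributes no pages of its own, which is why the entry count is one more than the extent count.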
The following structure is found in msfs/msfs/bs_ods.h.
Example 2-8: Extent Structure
typedef struct bsXtnt {
    uint32T bsPage;    /* Bitfile page number */
    uint32T vdBlk;     /* Logical (disk) block number */
} bsXtntT;
Using Tags
Overview
This section discusses the function of the fileset tag files. Do not confuse these files with the .tags directory.
• Tag file characteristics
• Tag file page
• Tagmap entries
• Root tag bitfile
• Fileset tag file
• Cloning through fileset tag file
• Utility for viewing tag file
• UNIX directories
• POSIX files
• AdvFS tag files and migration
Tag File Characteristics
Tag files have these characteristics:
• Bitfiles are identified by tags
• Tag files are:
— Arrays of tagmap entries
— Indexed by tag number
Tag file entries contain:
• Sequence number (high bit set if in use)
• Volume index
• Mcell ID within volume
Tag files are used to translate a bitfile tag to the location of its primary mcell. The file is indexed by tag number and the entry contains the logical address of the primary mcell, which is a tuple of the following format:
<volume index, BMT page number, mcell's index within the BMT page>
A bitfile tag consists of a tag number and a sequence number. Whenever a bitfile is deleted, its tag is placed back on the free list. However, for various consistency reasons (like crash recovery) AdvFS cannot reuse the tag unless it is made unique from previous uses of that tag. Therefore, each time a tag is reused, its sequence number is incremented to differentiate it from the previous use of the same tag. The sequence number has a limited number of bits, so a tag can be used 4096 times and then it becomes a dead tag, never to be reused again.
Tag files introduce an extra level of overhead in accessing file data.
The following figure depicts an FAS file access request directed to the tag file to get the location of the primary mcell. The file's mcells will point to the logical blocks of the file.
Figure 2-9: File Access Through Tag File
[Figure: the file system directory maps file1 to tag 623, file2 to tag 51, and file3 to tag 893. The fileset tag file (M1) entry for tag 893 gives sequence # 3, volume # 1, BMT page 811, mcell 7. BMT page 811, mcell 7 holds an extent at 80334 of 50 pages, and the file is on disk at LBN 80334.]
The following figure shows the file's data moving physically with no changes in the FAS level structure information.
Figure 2-10: Tag File Allowing Transparent Data Move
Tag File Page
The tag file page starts with a five-field, 16-byte header:
• Number of this page (for sanity checking)
• Next page with free tagmap entries
• Next free tagmap within this page
• Number of allocated tagmaps
• Number of dead tagmap entries
The header is followed by 1022 tagmap entries.
The following structure is found in msfs/msfs/bs_ods.h.
Example 2-9: Tag File Page Header Structure
typedef struct bsTDirPgHdr {
    uint32T currPage;         /* page number of this page */
    uint32T nextFreePage;     /* next page having free TMaps */
    uint16T nextFreeMap;      /* index of next free TMap, 1 based */
    uint16T numAllocTMaps;    /* count of allocated tmaps */
    uint16T numDeadTMaps;     /* count of dead tmaps */
    uint16T padding;
} bsTDirPgHdrT;
[Figure 2-10 content: the file system directory still maps file1 to tag 623, file2 to tag 51, and file3 to tag 893, and the fileset tag file (M1) entry for tag 893 still gives sequence # 3, volume # 1, BMT page 811, mcell 7. BMT page 811, mcell 7 now holds an extent at 88526 of 50 pages, and the file is on disk at LBN 88526.]
Tagmap Entries
There are three different formats for entries in a tag file:
• Head of free tagmap list
— Stored in first slot of page 0
— Points to:
* First page with a free tagmap entry
* First uninitialized page
• Element of free tagmap list
— Sequence number
— Pointer to next free entry on this page
• Allocated tagmap entry
— Sequence number
— Virtual disk of bitfile
— Mcell ID of bitfile
The following structure is found in msfs/msfs/bs_ods.h:
Example 2-10: Tagmap Structures
/*
 * Note that the
 * sequence number is only twelve bits. The 4-tuple
 *
 *     <domain id, bitfile set tag, bitfile tag, sequence number>
 *
 * must be unique for all time. When the sequence number wraps the
 * slot containing the tagmap struct becomes permanently unavailable.
 */
typedef struct bsTMap {
    union {
        /*
         * First tagmap struct on page zero only.
         */
        struct {
            uint32T freeList;    /* head of free list */
            uint32T unInitPg;    /* first uninitialized page */
        } tm_s1;

        /*
         * Tagmap struct on free list.
         */
        struct {
            uint16T seqNo;       /* must overlay seqNo in tm_s3 */
            uint16T unused;      /* padding to 4 byte boundary */
            uint32T nextMap;     /* next free tagmap struct within page */
        } tm_s2;

        /*
         * In use tagmap struct.
         */
        struct {
            uint16T seqNo;       /* must overlay seqNo in tm_s2 */
            uint16T vdIndex;     /* virtual disk index */
            bfMCIdT bfMCId;      /* bitfile mcell id */
        } tm_s3;
    } tm_u;
} bsTMapT;
Root Tag File
The root tag file:
• Contains an entry for each fileset of the domain.
• Contains the mcell IDs of the fileset tag files.
• Can be used to find the list of filesets in the domain.
Since you can have many filesets within a domain, there must be a way to locate the tag file that is pertinent to each fileset. This is accomplished using the root tag file.
Fileset Tag File
Each fileset has its own fileset tag file.
• Maps fileset bitfiles to primary mcells
• Has these special fileset tags
— 1 fragment bitfile
— 2 root directory
— 3 .tags
— 4 user quota file
— 5 group quota file
Cloning through Fileset Tag File
Cloning is accomplished by creating a new fileset and copying the original fileset's tag file information. Note that the original fileset's data is not copied unless it is altered after the clone is created. If the original data is altered, the clone gets a copy of the unchanged data only. A clone is a read-only mechanism to be created and used for short-term operations (such as backups) and then removed. The clone should be recreated if needed again.
A clone fileset is a read-only, virtual copy of the data as it existed at the time the clone was created.
As original data changes while the clone exists:
• A copy of the original data must be created.
• A new mcell must be allocated in the BMT.
The appropriate entry in the new copy of the fileset tag file must be updated to point to the cloned page’s new mcell in the BMT.
The figure shows the fileset tag file without a clone (on the left) and then shows how the structures change when a clone is created.
Figure 2-11: Fileset Tag File Before and After Cloning
No data is copied unless a change is made to the original data. Only the data about to be changed is copied. A clone is effectively a snapshot of the data at a known time.
The following figure depicts the fileset tag file after a change has been made to the original data.
[Figure 2-11 content: with no clone, the fileset tag file (M1) maps tag 85 to mcell 14, and BMT mcell 14 points to the original data at LBN 919. After the clonefset command, the clone's fileset tag file (M2) also maps tag 85 to mcell 14, so both tag files refer to the same original data.]
Figure 2-12: Clone Structures After Data Write
Utility for Viewing Tag Files
The nvtagpg utility prints formatted pages of a root tag file or a fileset tag file.
The example shows how to display the root tag file.
Example 2-11: Displaying Root Tag File Information
# nvtagpg -r usr_domain
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 96 root TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 3 numDeadTMaps 0 nextFreePage 0 nextFreeMap 5

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 1 usr
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 13 var
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 2 2 4 ob_fset
[Figure 2-12 content: after data has been changed, the fileset tag file (M1) still maps tag 85 to mcell 14, and BMT mcell 14 points to LBN 919, which holds the original data including changes. The clone's fileset tag file (M2) now maps tag 85 to mcell 22, and BMT mcell 22 points to LBN 1214, which holds the original data copied before changes were made in LBN 919.]
The example shows how to display the fileset tag file.
Example 2-12: Displaying Fileset Tag File
# nvtagpg -r usr_domain usr
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 1047552 "usr" FRAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 1021 numDeadTMaps 0 nextFreePage 23 nextFreeMap 0

tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 3
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 4
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 1 0 6
tMapA[4] tag 4 seqNo 1 primary mcell (vol,page,cell) 1 0 7
tMapA[5] tag 5 seqNo 1 primary mcell (vol,page,cell) 1 0 8
(...)
Given a particular tag number, divide by 1022:
• Quotient is the page
• Remainder is tagmap slot
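The division above is small enough to write out directly. In this sketch the function and constant names are hypothetical; the 8-byte tagmap entry size is inferred from the bsTMap union (two 32-bit words), which together with the 16-byte page header gives the 1022 entries per 8K page.

```c
#include <assert.h>
#include <stdint.h>

enum {
    TAG_PAGE_BYTES        = 8192,
    TAG_PAGE_HEADER_BYTES = 16,
    TAGMAP_BYTES          = 8,      /* inferred from the bsTMap union */
    TAGMAPS_PER_PAGE      = 1022
};

/* (8192 - 16) / 8 = 1022 tagmap entries per page. */
static_assert((TAG_PAGE_BYTES - TAG_PAGE_HEADER_BYTES) / TAGMAP_BYTES
                  == TAGMAPS_PER_PAGE,
              "a tag file page holds 1022 tagmap entries");

/* Quotient: tag file page that holds the entry. */
uint32_t tag_to_page(uint32_t tag)
{
    return tag / TAGMAPS_PER_PAGE;
}

/* Remainder: tagmap slot within that page. */
uint32_t tag_to_slot(uint32_t tag)
{
    return tag % TAGMAPS_PER_PAGE;
}
```

For tag 22896 this gives page 22 and slot 412, which matches the page number and tMapA[412] entry in Example 2-13.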
The example shows how to display an individual file's tag file entry.
Example 2-13: Displaying Individual File's Tag File Entry
# ls -li big1
22896 -rwxr-xr-x 1 root system 13729520 Jun 24 16:53 big1
# nvtagpg -r usr_domain -T 1 -t 22896
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 2055136 "usr" FRAG page 22
--------------------------------------------------------------------------
currPage 22
numAllocTMaps 1022 numDeadTMaps 0 nextFreePage 0 nextFreeMap 0

tMapA[412] tag 22896 seqNo 1 primary mcell (vol,page,cell) 1 951 15
# bc
22896/1022
22
22896%1022
412
^D
#
UNIX Directories
UNIX directories are contained in standard bitfiles.
The AdvFS format is similar to the UFS format, except that:
• Tag numbers replace inode numbers.
• The 64-bit tag.sequence ID is hidden in the padding at the end of each component entry.
Two levels of directories support file migration between disks.
AdvFS directories have the same basic structure as UFS directories, except that the complete bitfile tag is stored after the file name in each entry instead of an inode number. Directories are extended one 8KB page at a time. Each 8KB page is subdivided into 512-byte sections. Each section contains variable-length entries that translate a file name to an AdvFS bitfile tag (unique identifier). Each entry has the following format:
• Tag number: 32 bits
• Entry length: 16 bits
• Name length: 16 bits
• Name: variable-length, zero-padded to nearest 32-bit boundary
• Tag+sequence#: 64 bits
POSIX Files
The following figure shows how a directory file points to a fileset tag file, to an entry in the BMT, and ultimately to a POSIX file.
Figure 2-13: Relationship to POSIX Files
AdvFS Tag Files and Migration
Why does AdvFS have this extra level in the lookup path to a bitfile's metadata? Tag files are key structures that enable AdvFS to migrate bitfiles in a way that is transparent to the FAS. Migration relies on two key features:
• The ability to move a bitfile’s data transparently; this is supported by the buffer cache and I/O scheduling algorithms.
• The ability to move a bitfile’s metadata to another volume; tag files enable this.
When migration moves a bitfile’s metadata to another volume, it simply updates the bitfile’s tag directory entry to point to the new metadata location. This makes migration a purely BAS issue or feature since tag directories are part of the BAS layer and the FAS layer’s structures are left unchanged.
[Figure 2-13 content: Directory File, then Fileset Tag File, then BMT, then File Data.]
Assigning Fragments
Overview
A file that is not an exact multiple of 8K in size will most likely have a fragment assigned to it. This fragment will hold the excess data in a piece of disk storage represented in a fragment bitfile.
• Fragment bitfile
• Fragment groups
• Fragment header
• Fragment utilities
• Fragments and files
Fragment Bitfile
These are characteristics of fragment bitfiles:
• One per fileset
• Contains small (< 8K) ends of files
• Allocated in 8K (16 sector) units
If the file is less than 100K, a fragment is used if necessary. This biases the fragmentation algorithm toward smaller files. Why make special arrangements for the final fragment (<8K) of a 300M file?
The fragment bitfile is divided into a series of 128K fragment groups.
Fragment Groups
Each fragment group consists of fragments of one particular size (free, that is 0K, or 1K, 2K, ... 7K).
Each fragment group has:
• A one-page (1K) fragment header.
• Fragments sized for the group.
Fragment addresses are usually:
• In 1K pages.
• Relative to start of fragment bitfile.
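The group geometry can be worked through with a little arithmetic. The sketch below assumes that each 128K group gives up 1K to its header and that the remaining 127K is cut into as many equal fragments as fit, with the remainder left unused. The function names are hypothetical and this is not the AdvFS allocator.

```c
#include <assert.h>

enum { GROUP_KB = 128, GROUP_HEADER_KB = 1 };

/* Fragments of a given size (1K to 7K) that fit in one group. */
unsigned frags_per_group(unsigned frag_kb)
{
    assert(frag_kb >= 1 && frag_kb <= 7);
    return (GROUP_KB - GROUP_HEADER_KB) / frag_kb;
}

/* Space in one group that cannot hold a whole fragment, in K. */
unsigned wasted_kb_per_group(unsigned frag_kb)
{
    assert(frag_kb >= 1 && frag_kb <= 7);
    return (GROUP_KB - GROUP_HEADER_KB) % frag_kb;
}
```

Under these assumptions a 4K group holds 31 fragments and leaves 3K unused. That is consistent with the nvfragpg output in Example 2-15, where 76 groups of 4K fragments show 76K of overhead and 228K wasted.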
Fragments are not extents.
• An extent is a contiguous range of 8K pages.
• A fragment is a chunk of disk space between 1K and 7K in size.
The figure shows the fragment bitfile locating various fragment groups.
Figure 2-14: Fragment Bitfile Locating Fragment Groups
Fragment Header
Fields in the fragment header (1024 bytes) include:
• Pointer to next fragment group of this type with free space
• Page number of this page (a sanity check)
• Type of this group
• Number of free fragments
• Fileset ID
• Version
• List of free fragments in the group
[Figure 2-14 content: the fragment bitfile keeps one list head per fragment size (1k through 7k); for example, the 2k list head leads to a list of 2k fragments and the 4k list head leads to a list of 4k fragments.]
The following structure is found in msfs/msfs/bs_bitfile_sets.h.
Example 2-14: Fragment Group Header Structure
typedef struct grpHdr {
    uint32T nextFreeFrag;    /* frag index (valid only when "version == 0") */
    uint32T lastFreeFrag;    /* frag index (valid only when "version == 0") */
    uint32T nextFreeGrp;     /* page number */
    uint32T self;            /* this group's starting page number */
    bfFragT fragType;        /* type of frags in this group */
    int freeFrags;           /* number of free frags in the group */
    bfSetIdT setId;          /* bitfile-set's ID */

    /*
     * the following fields were added in ADVFS v3.0
     * they were all zeros in pre-ADVFS v3.0
     */

    unsigned int version;    /* metadata version pre-ADVFS v1.0 == 0, ADVFS v3.0 == 1 */
    uint32T firstFrag;       /* frag index */

    /*
     * the following is used as a map of the free frags in the group.
     * it is a linked list where element zero (0) is used as the head
     * of the list (since frag 0 is always the group header it can
     * never be allocated so element zero would otherwise be unused)
     */
    unsigned short freeList[ BF_FRAG_GRP_SLOTS ];
} grpHdrT;
Fragment Utilities
The nvfragpg utility displays statistics about fragment use.
This example shows summary statistics.
Example 2-15: Fragment Group Statistics Display Using nvfragpg
# nvfragpg -r usr_domain usr
==========================================================================
DOMAIN "usr_domain"
--------------------------------------------------------------------------
reading 438 frag group headers, 400 headers read
frag type     free     1K     2K     3K     4K     5K     6K     7K  totals
groups           2     44     67     80     76     63     52     54     438
frags            -   5588   4254   3386   2413   1600   1100    979   19320
frags used       -   5469   4245   3364   2405   1581   1093    973   19130
disk space    256K  5632K  8576K  10.0M  9728K  8064K  6656K  6912K   54.8M
space used       -  5469K  8490K   9.9M  9620K  7905K  6558K  6811K   53.7M
space free    254K   119K    18K    66K    32K    95K    42K    42K    668K
overhead        2K    44K    67K    80K    76K    63K    52K    54K    438K
wasted           -     0K    67K    80K   228K   126K    52K    54K    607K
% used           -    97%    98%    98%    98%    98%    98%    84%     98%
This example shows a fragment free list.
Example 2-16: Fragment Free List Display Using nvfragpg
# nvfragpg -fr usr_domain usr
==========================================================================
DOMAIN "usr_domain"
--------------------------------------------------------------------------
reading 438 frag group headers, 100 headers read
reading 438 frag group headers, 200 headers read
reading 438 frag group headers, 300 headers read
reading 438 frag group headers, 400 headers read
frag type     free     1K     2K     3K     4K     5K     6K     7K  totals
groups           2     44     67     80     76     63     52     54     438
frags            -   5588   4254   3386   2413   1600   1100    979   19320
frags used       -   5469   4245   3364   2405   1581   1093    973   19130
disk space    256K  5632K  8576K  10.0M  9728K  8064K  6656K  6912K   54.8M
space used       -  5469K  8490K   9.9M  9620K  7905K  6558K  6811K   53.7M
space free    254K   119K    18K    66K    32K    95K    42K    42K    668K
overhead        2K    44K    67K    80K    76K    63K    52K    54K    438K
wasted           -     0K    67K    80K   228K   126K    52K    54K    607K
% used           -    97%    98%    98%    98%    98%    98%    84%     98%

head of free lists of frag groups from fileset attributes:
frag type BF_FRAG_ANY firstFreeGrp 6976 lastFreeGrp 32
frag type BF_FRAG_1K firstFreeGrp 6960 lastFreeGrp 560
frag type BF_FRAG_2K firstFreeGrp 6880 lastFreeGrp 6880
frag type BF_FRAG_3K firstFreeGrp 6944 lastFreeGrp 848
frag type BF_FRAG_4K firstFreeGrp 6832 lastFreeGrp 816
frag type BF_FRAG_5K firstFreeGrp 6896 lastFreeGrp 864
frag type BF_FRAG_6K firstFreeGrp 6768 lastFreeGrp 6768
frag type BF_FRAG_7K firstFreeGrp 6864 lastFreeGrp 6864

any 6976 6992
free 1K 6960 6912 6928 560
full 1K 0 112 128 176 224 320 400 496 944 1184 1616 1840 2928 3184 3280 3424 3568 3808 3856 3904 3936 3968 4192 4544 4816 4864 4912 4928 4944 5120 5344 5456 5536 5600 5760 6240 6528 6608 6640 6848
free 2K 6880
full 2K 64 208 256 272 288 336 480 624 784 928 1040 1168 1264 1520 1664 2480 2736 3024 3056 3072 3088 3104 3120 3200 3360 3504 3792 3824 3872 3888 3952 4032 4080 4144 4160 4288 4304 4352 4400 4448 4496 4528 4560 4592 4608 4624 4656 4688 4704 4896 4976 5168 5360 5488 5552 5584 5616 5664 5824 5984 6144 6320 6480 6560 6624 6656

(...)

free 7K 6864
full 7K 80 384 512 544 656 704 768 832 912 1024 1328 1440 1568 1680 1792 2528 2608 2688 2800 2944 3168 3296 3408 3488 3536 3584 3600 3632 3664 3680 3712 4048 4224 4432 4784 5072 5184 5200 5232 5264 5472 5680 5792 5888 6000 6080 6128 6224 6336 6400 6448 6672 6816
Fragments and Files
AdvFS makes special arrangements to handle fragments of files. One of the file's mcell records found in the BMT will have fragment information.
• POSIX stats record of mcells contains fragId field.
• fragId.frag is the page offset of fragment.
• fragId.type is the size of fragment.
• Use showfile to determine if there is a fragment.
Does number of pages match file size?
• Use nvbmtpg to find fragment location.
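The page-count check and the fragment address can be expressed as simple arithmetic. In this sketch the function names are hypothetical; it assumes 8K pages and 1K fragment units, and it assumes a fragment is sized by rounding the file's trailing partial page up to the next 1K.

```c
#include <assert.h>
#include <stdint.h>

enum { PAGE_BYTES = 8192, FRAG_UNIT_BYTES = 1024 };

static_assert(PAGE_BYTES % FRAG_UNIT_BYTES == 0,
              "a page is a whole number of 1K units");

/* Number of full 8K pages in a file of st_size bytes. */
uint64_t file_full_pages(uint64_t st_size)
{
    return st_size / PAGE_BYTES;
}

/* Size of the trailing partial page, rounded up to whole K.
 * 0 means the file ends on a page boundary; 8 means the tail is
 * too large for any fragment size (1K to 7K). */
unsigned file_tail_kb(uint64_t st_size)
{
    uint64_t tail = st_size % PAGE_BYTES;
    return (unsigned)((tail + FRAG_UNIT_BYTES - 1) / FRAG_UNIT_BYTES);
}

/* Byte offset of a fragment within the fragment bitfile, given the
 * 1K page offset reported as fragId.frag. */
uint64_t frag_byte_offset(uint32_t frag_index)
{
    return (uint64_t)frag_index * FRAG_UNIT_BYTES;
}
```

For the 31114-byte file ob_1 in Example 2-17, this gives 3 full pages and a 7K tail, which agrees with the extent map ending at bsPage 3 and with fragId.type BF_FRAG_7K. The value returned by frag_byte_offset corresponds to what dd computes from bs=1024 and iseek.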
After you have found a fragment location, you can copy it using dd, with a command similar to:
dd if=/users/.tags/1 of=/tmp/frag.cpy bs=1024 iseek=717 count=2
The following example hunts down a fragment of a file. The test file ob_1 is a copy of the /etc/disktab file.
Example 2-17: Tracking Down a Fragment
# ls -li ob_1
22894 -rwxr-xr-x 1 root system 31114 Jun 24 15:20 ob_1
#
#
# nvbmtpg -r usr_domain usr 22894 -c
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 23552 BMT page 951
--------------------------------------------------------------------------
CELL 4 next mcell volume page cell 0 0 0 bfSetTag,tag 1,22894

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 572976 (0x8be30)
bsXA[ 1] bsPage 3 vdBlk -1

RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100755 (S_IFREG) st_uid 0 st_gid 0 st_size 31114
st_nlink 1 dir_tag 22893 st_mtime Thu Jun 24 15:20:45 1999
fragId.type BF_FRAG_7K fragId.frag 54990
#
# dd if=/usr/.tags/1 of=/tmp/obfrag bs=1024 iseek=54990 count=2
2+0 records in
2+0 records out
#
#
# cat /tmp/obfrag:
ra71|RA71|DEC RA71 Winchester:\
 :ty=winchester:dt=MSCP:ns#51:nt#14:nc#1915:\
 :oa#0:pa#131072:ba#8192:fa#1024:\
 :ob#131072:pb#262144:bb#8192:fb#1024:\
 :oc#0:pc#1367310:bc#8192:fc#1024:\
 :od#393216:pd#324698:bd#8192:fd#1024:\
 :oe#717914:pe#324698:be#8192:fe#1024:\
 :of#1042612:pf#324698:bf#8192:ff#1024:\
 :og#393216:pg#819200:bg#8192:fg#1024:\
 :oh#1212416:ph#154894:bh#8192:fh#1024:
(...)
The FAS layer uses fragments in the following way. When a write extends a file beyond its fragment size, a full page is allocated to the file, the contents of the fragment are copied to the new page, and the fragment is deallocated.
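The arithmetic behind Example 2-17 can be sketched in Python. This is only an illustration, not AdvFS code: the function name split_size is hypothetical, and the sketch assumes that the tail of the file beyond its last whole 8K page is kept in one fragment rounded up to the next 1K, with tails larger than 7K given a full page instead.

```python
PAGE = 8192        # AdvFS page size in bytes
FRAG_UNIT = 1024   # fragments come in 1K steps, 1K through 7K

def split_size(st_size):
    """Return (whole_pages, frag_kb) for a file of st_size bytes.

    Assumption: the tail past the last whole page goes into a single
    fragment rounded up to a 1K multiple; a tail over 7K gets a page.
    """
    pages, tail = divmod(st_size, PAGE)
    frag_kb = -(-tail // FRAG_UNIT)    # ceiling division
    if frag_kb > 7:                    # there is no 8K fragment type
        return pages + 1, 0
    return pages, frag_kb

# ob_1 from Example 2-17: st_size 31114
print(split_size(31114))
```

For ob_1 this gives three whole pages and a 7K fragment, which agrees with the extent map ending at bsPage 3 and the BF_FRAG_7K fragment type shown by nvbmtpg.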
Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile
Overview
The storage bitmap file (SBM) represents the storage within an AdvFS volume. The miscellaneous bitfile represents a fake superblock and other disk overhead structures typically found in a volume.
• Storage bitmap file characteristics
• SBM format
• Miscellaneous bitfile
Storage Bitmap Bitfile Characteristics
How does AdvFS know which disk blocks are free and which are in use? The storage bitmap file:
• Represents each 8K of on-disk storage within a volume with 1 bit (1K per bit in DIGITAL UNIX V4.0)
• Is basically a group of bits representing whether or not storage is in use.
• Is accessible as .tags/M-7 (for volume 1)
Characteristics of the SBM bitfile include:
• One per volume
• All storage is either free or in a bitfile
• AdvFS storage is allocated in clusters
1 cluster == 2 sectors == 1024 bytes
• On-disk SBM is little more than an array of bits (1 bit per page)
Each AdvFS volume contains a storage bitmap, which keeps track of allocated disk space. In AdvFS terminology, a block is a 512-byte sector, a cluster is one or more contiguous blocks, and a page is 16 blocks. Each bit in the storage bitmap represents a page. If the bit is set, the page is allocated to a bitfile; if the bit is clear, the page is free (available for allocation). The cluster size is definable on a volume basis; however, AdvFS currently uses a cluster size of two blocks (1K byte) for all volumes. The bigger the page size, the smaller the bitmap.
The storage bitmap is structured as an array of 8KB pages where each page consists of an array of 32-bit integers (each bit represents a page). Each page also contains a header containing an XOR checksum of the integer array.
SBM Format
SBM page format consists of:
• SBM page header
— Sequence number, now unused (32 bits)
— XOR field for rest of page (32 bits)
• Bitmap
— 65472 bits
— Enough for 8184 pages (8K each)
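The figures in this list follow from the page and header sizes. The Python sketch below reproduces them; the variable names are illustrative only, and the last line assumes one bit per 1K cluster (eight bits per 8K page), which is what the 8184-page figure implies.

```python
PAGE_BYTES = 8192
HEADER_BYTES = 4 + 4             # sequence number + XOR field, 32 bits each

bitmap_bytes = PAGE_BYTES - HEADER_BYTES
map_ints = bitmap_bytes // 4     # 32-bit integers in the bitmap array
bitmap_bits = bitmap_bytes * 8
# Assumption: one bit per 1K cluster, so eight bits describe one 8K page.
pages_covered = bitmap_bits // 8

print(map_ints, bitmap_bits, pages_covered)
```

The count of 2046 integers per SBM page matches the vsbmpg listings in this section, where SBM page 1 begins at index 2046.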
The following example shows access to SBM through .tags.
Example 2-18: SBM Display Through .tags/M-7
# od -x -N 1000000 /var/.tags/M-7
0000000 0000 0000 ffff 00ff ffff ffff ffff ffff
0000020 ffff ffff ffff ffff ffff ffff ffff ffff
*
0001220 ffff 00ff 0000 0000 0000 0000 0000 0000
0001240 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000 0000 0000 0000 ffff ffff ffff ffff ffff
0020020 ffff ffff ffff 00ff ffff ffff ffff ffff <== Big bitmap.
0020040 ffff ffff ffff ffff ffff ffff ffff ffff
*
0020100 ffff ffff ffff ffff 0000 ff00 ffff ffff
0020120 ffff 00ff ffff ffff ffff ffff ffff ffff
0020140 ffff ffff ffff ffff ffff ffff ffff ffff
*
(...)
0040000 0000 0000 0000 ff00 0000 0000 0000 0000
0040020 0000 0000 0000 0000 0000 0000 0000 0000
*
0040500 0000 0000 0000 0000 0000 ff00 0000 0000
0040520 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000
#
#
# showfile -x /var/.tags/M-7

Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
fffffff9.0000 1 16 4 simple ** ** ftx 100% M-7

extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 4 1 112 64
extentCnt: 1
#
#
# ls -li /var/.tags/M-7
4294967289 ---------- 0 root system 24576 Dec 31 1969 /var/.tags/M-7
#
To see if a particular page, say 40000, is free:
1. Divide 40000 by 8184.
Quotient = 4, page number within SBM
Remainder = 7264, byte offset into page 4 bitmap array
2. Byte offset into SBM for page 40000 is 4 * 8192 + 7264 + 8, or 40040 (40000 + (4+1)*8 is easier).
3. Read byte with the od command:
od -x -j 40040 -N 1 /usr/.tags/M-7
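The three steps above can be written as a short Python function. This is a sketch of the calculation only; the name sbm_byte_offset is hypothetical, and it assumes the layout used in the steps: an 8-byte header per 8192-byte SBM page and one bitmap byte per 8K page of storage.

```python
PAGES_PER_SBM_PAGE = 8184   # bitmap bytes in one SBM page
SBM_PAGE_BYTES = 8192
SBM_HEADER_BYTES = 8

def sbm_byte_offset(page):
    """Byte offset within the SBM bitfile of the byte describing `page`."""
    sbm_page, byte_in_map = divmod(page, PAGES_PER_SBM_PAGE)
    return sbm_page * SBM_PAGE_BYTES + SBM_HEADER_BYTES + byte_in_map

print(sbm_byte_offset(40000))
```

The result can be passed to the -j option of od, as in step 3.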
This example shows information on the allocation status of block 40000.
Example 2-19: SBM Page Allocation Status Display Using vsbmpg
# vsbmpg -r usr_domain 1 -B 40000
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 112 SBM page 0
--------------------------------------------------------------------------
block 40000 (0x9c40) is in sbm index 625
mapInt[625] 00000000 00000000 00000000 00000000
block 40000 ^
#
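The index that vsbmpg reports can be reproduced with a little arithmetic. The Python sketch below is an illustration, not the tool's source: locate_block is a hypothetical name, and the constants are inferred from the vsbmpg listings in this section (each 32-bit mapInt covers 64 blocks, so one bit covers a two-block cluster). The bit position within the integer assumes the lowest-numbered block maps to the least significant bit.

```python
BLOCKS_PER_BIT = 2      # one bit per 1K cluster (two 512-byte blocks)
BITS_PER_INT = 32
INTS_PER_SBM_PAGE = 2046

def locate_block(block):
    """Return (sbm_page, map_index, bit_in_int) for a disk block number."""
    bit = block // BLOCKS_PER_BIT
    map_index, bit_in_int = divmod(bit, BITS_PER_INT)
    sbm_page = map_index // INTS_PER_SBM_PAGE
    return sbm_page, map_index, bit_in_int

def is_allocated(map_int, bit_in_int):
    # Assumption: bit 0 is the least significant bit of the integer.
    return bool((map_int >> bit_in_int) & 1)

print(locate_block(40000))
```

Block 40000 lands in SBM page 0 at index 625, as in Example 2-19, and block 0x1ff80 lands at index 2046, the first integer of SBM page 1.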
This example shows summary SBM information displayed by vsbmpg.
Example 2-20: SBM Summary Information Display
# vsbmpg -r usr_domain
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g)
--------------------------------------------------------------------------
There are 3 pages in the SBM on this volume.
The volume has 2532999 blocks (158312 pages).
11480 pages (7%) are used.

==========================================================================
DOMAIN "usr_domain" VDI 2 (/dev/rdisk/dsk4b)
--------------------------------------------------------------------------
There are 1 pages in the SBM on this volume.
The volume has 262144 blocks (16384 pages).
236 pages (1%) are used.
This example shows all SBM pages for a volume.
Example 2-21: SBM Pages Displayed Using vsbmpg
# vsbmpg -r usr_domain 1 -a
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 112 SBM page 0
--------------------------------------------------------------------------
lgSqNm 0 xor 6aa2212b
index block mapInt[]
0 0 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
4 100 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
8 200 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
12 300 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
16 400 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
20 500 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
24 600 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
28 700 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
32 800 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
36 900 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
40 a00 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
44 b00 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
48 c00 000003ff 00000000 00000000 00000000 *.
52 d00 00000000 00000000 00000000 00000000
56 e00 00000000 00000000 00000000 00000000
60 f00 00000000 00000000 00000000 00000000
64 1000 00000000 00000000 00000000 00000000
68 1100 00000000 00000000 00000000 00000000
72 1200 00000000 00000000 00000000 00000000
76 1300 00000000 00000000 00000000 00000000
80 1400 00000000 00000000 00000000 00000000
84 1500 00000000 00000000 00000000 00000000
88 1600
(...)
1028 10100 00000000 00000000 00000000 00000000
1032 10200 00000000 00000000 00000000 00000000
1036 10300 00000000 ff000000 ffffffff ffffffff . .... ....
1040 10400 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
1044 10500 ffffffff ffffffff ffffffff ffffffff .... .... .... ....
1048 10600
(...)
2036 1fd00 00000000 00000000 00000000 00000000
2040 1fe00 00000000 00000000 00000000 00000000
2044 1ff00 00000000 00000000 00000000 00000000
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk2g) lbn 128 SBM page 1
--------------------------------------------------------------------------
lgSqNm 0 xor ffda9d37
index block mapInt[]
2046 1ff80 2b7ff3ff ffffe3ff ffffffff fffbf7ff ***. ..*. .... .**.
2050 20080 e7f7ffff 0c0070ff 442fff00 c0000002 **.. * *. **. * *
2054 20180 0bfdfffc 1c07ffe0 ffbd60ff 3e11ffff **.* **.* .**. **..
(...)
#
Miscellaneous Bitfile
Characteristics of the miscellaneous bitfile include:
• One per volume
• Holds pages for:
— Primary and secondary boot block
— Partition table (disk label)
— Fake UFS super block with AdvFS magic number
Summary
Introducing AdvFS On-Disk Structures
AdvFS is built using a two-layer strategy separating file access support from file storage support. The two layers are:
• File access system (FAS)
• Bitfile access system (BAS)
Consider the .tags directory as a way to access the BAS from the context of the FAS. All files and metadata are accessible through .tags by using the file’s tag number or a special metadata file name.
All AdvFS on-disk structures can be accessed as bitfiles. This includes user files and directories as well as the AdvFS metadata structures.
Bitfiles are arrays of 8K disk pages holding user data or metadata. A series of contiguous 8K pages in a bitfile is stored as an extent.
Each bitfile is identified by its tag, which consists of a tag number and sequence number pair. Use the tag number to locate the extents of a file.
Several metadata bitfiles (RBMT, BMT) have an internal page organization consisting of a page header and a series of mcell data structures. Each mcell can contain a series of variable-length records describing various bitfile attributes and characteristics.
AdvFS files have a means of identification: a tag
• Similar to the UFS inode number
• Can be discovered with the ls -i command
Bitfile-sets have the following characteristics:
• FAS fileset represents a BAS bitfile-set
• Identified by numbers
Tag numbers can be reused:
• With file creation and deletion.
• Like inode numbers.
Describing BAS On-Disk Metadata Bitfiles
Each volume in an AdvFS domain consists of the following structures:
• Reserved bitfile metadata table
• Bitfile metadata table
• Storage bitmap
• Miscellaneous bitfile
Each domain is supported by the following bitfiles:
• The on-disk log
• The root tag file
Each volume is supported by the following bitfiles:
• Reserved bitfile metadata table (RBMT)
• Bitfile metadata table (BMT)
Bitfile metadata table (BMT) holds the support data for user files and directories. It contains location information, permissions and other stats, extent information, fragment location, and other descriptive data.
The BMT stores bitfile metadata, including:
• Bitfile attributes
• Bitfile extent maps
• Bitfile set attributes
• FAS file attributes including the POSIX file stats
These characteristics describe mcells and the records within them.
• Inodes of AdvFS are 28 fixed-size (292 byte) mcells packed into 8K pages
• One or more linked mcells describe bitfiles
• First mcell in list is primary mcell
• Each mcell contains variably sized records describing attributes of the bitfile
Using Extent Maps
Extent maps are stored in mcells linked to a bitfile’s primary mcell. Since mcells are of limited size, an extent map may span several mcells (the pieces of the map in different mcells are called subextent maps).
Using Tags
Tag files translate a bitfile tag to the location of its primary mcell. The file is indexed by tag number, and each file entry contains the logical address of the primary mcell, which is a tuple of the following format:
<volume index, BMT page number, mcell’s index within the BMT page>
A bitfile tag consists of a tag number and a sequence number. Whenever a bitfile is deleted, its tag is placed back on the free list. However, for various consistency reasons (like crash recovery) AdvFS cannot reuse the tag unless it is made unique from previous uses of that tag. So, each time a tag is reused, its sequence number is incremented to differentiate it from the previous use of the same tag. The sequence number has a limited number of bits, so a tag can be used 4096 times and then it becomes a dead tag, never to be reused again.
Assigning Fragments
Storage allocation in AdvFS is done in 8KB page units. For small files this can cause internal fragmentation. To solve this problem, AdvFS uses storage fragments that are 1KB to 7KB in size to store small files and the ends of files less than 100KB in size.
Fragments are allocated from the fragment bitfile, which is a metadata bitfile associated with each bitfile-set (it is always assigned tag 1). The basic structure of the fragment bitfile is a collection of fragment groups where each group contains a header and an array of fragments of a uniform size.
Defining the Storage Bitmap Bitfile and Miscellaneous Bitfile
Each AdvFS volume contains a storage bitmap that keeps track of and allocates disk space. In AdvFS terminology, a block is a 512-byte sector, a cluster is one or more contiguous blocks, and a page is 16 blocks. Each bit in the storage bitmap represents a page. If the bit is set, the page is allocated to a bitfile; if it is clear, the page is free (available for allocation). The cluster size is definable on a volume basis; however, AdvFS currently uses a cluster size of two blocks (1K byte) for all volumes.
The storage bitmap is structured as an array of 8KB pages where each page consists of an array of 32-bit integers.
Exercises
The exercises in this chapter are preceded by a refresher or primer section. Please read the information carefully. It serves not only as a reminder of lecture information but also, at times, introduces new points.
Bitfiles and Tags Lab Refresher
The lowest level of the AdvFS is the bitfile access system. Here every file is a bitfile, a collection of 8192-byte pages. The higher level of AdvFS, or file access system, transforms bitfiles into normal UNIX files.
When tags are used for the first time, they are given a sequence number of 8001 hexadecimal or 32769 decimal. You will notice from the output of showfile that sequence numbers rarely get much greater than the initial 8001 value.
AdvFS file systems are also identified by tags. If you use the showfsets command on a file domain, you will see that the ID of a fileset is a sequence of four hexadecimal numbers, such as 319b7053.00092e01.2.8002.
The first two hexadecimal numbers, in this case, 319b7053.00092e01, identify the file domain. The last two hexadecimal numbers, here 2.8002, are tags that identify the fileset.
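As an illustration of how such an ID breaks down, the Python sketch below splits the dotted string into its domain and fileset parts. The function name parse_fileset_id and the dictionary keys are hypothetical; the only assumption taken from the text is that all four fields are hexadecimal.

```python
def parse_fileset_id(text):
    """Split 'dddddddd.dddddddd.tag.seq' (all hexadecimal) into its parts."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated hexadecimal fields")
    d1, d2, tag, seq = (int(p, 16) for p in parts)
    return {"domain_id": (d1, d2), "fileset_tag": (tag, seq)}

fsid = parse_fileset_id("319b7053.00092e01.2.8002")
print(fsid["fileset_tag"])          # tag number and sequence number
```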
Every AdvFS file system has a .tags subdirectory that allows direct access, for the superuser, to bitfiles by tag number. A file in the /users file system with tag 1bb88.803a can be addressed using .tags through a wide variety of names including:
/users/.tags/0x1bb88
/users/.tags/113544
/users/.tags/0x1bb88.0x803a
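The equivalence of these names is just a matter of number base. The Python sketch below generates the three spellings shown above from a tag number and sequence number; tags_names is a hypothetical helper, not an AdvFS utility, and it only covers the three forms listed here.

```python
def tags_names(mount_point, tag_num, seq):
    """Return the .tags path spellings shown in the text for one tag."""
    base = f"{mount_point}/.tags"
    return [
        f"{base}/{tag_num:#x}",            # hexadecimal tag number
        f"{base}/{tag_num}",               # decimal tag number
        f"{base}/{tag_num:#x}.{seq:#x}",   # hexadecimal tag and sequence
    ]

for name in tags_names("/users", 0x1bb88, 0x803a):
    print(name)
```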
Exercise
1. Use the showfile and the ls -i commands to list the tag numbers of a few AdvFS files and then access the files through the appropriate .tags directory.
2. Use the tag2name command, located in /sbin/advfs, to translate an AdvFS tag into the corresponding file name.
BMT and RBMT Lab Refresher
The BMT is a bitfile that contains metadata cell (mcell) records. Every virtual disk of the file domain has a BMT. The mcell records of the BMT contain almost all information that describes the files of the virtual disk plus additional information about the file domain and filesets. The RBMT contains the mcells for the BMT and other reserved bitfiles.
One important use of the BMT is the storage of extent records that describe the pages used by all bitfiles of the disk.
Each 8192-byte page of the BMT consists of a 16-byte header followed by an array of twenty-eight 292-byte mcells.
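The page layout can be checked with simple arithmetic. The following Python sketch uses only the sizes given above; the helper name mcell_offset is hypothetical.

```python
BMT_PAGE_BYTES = 8192
BMT_HEADER_BYTES = 16
MCELL_BYTES = 292
MCELLS_PER_PAGE = 28

# The header plus 28 mcells fill the page exactly.
assert BMT_HEADER_BYTES + MCELLS_PER_PAGE * MCELL_BYTES == BMT_PAGE_BYTES

def mcell_offset(cell):
    """Byte offset of mcell `cell` (0 through 27) within its BMT page."""
    if not 0 <= cell < MCELLS_PER_PAGE:
        raise ValueError("mcell index out of range")
    return BMT_HEADER_BYTES + cell * MCELL_BYTES

print(mcell_offset(4))
```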
The BMT header starts with three fields used to track free mcells. It then contains a field giving the page’s number within the BMT bitfile and the version number of the advanced file system being used in this file domain. The present version number is 4.
Mcell Format Lab Refresher
Every bitfile has a primary mcell. The primary mcell is the beginning of a chain of mcells which describe the bitfile. Mcells are addressed by a 32-bit mcell ID, bfMCIdT, in which the first 27 bits give the mcell’s BMT page number and the remaining 5 bits give the mcell’s position within its BMT page. Tag files, described in another section, translate a bitfile’s tag number into a 16-bit disk index and a 32-bit mcell ID which points to the bitfile’s primary mcell.
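A minimal Python sketch of packing and unpacking such a 32-bit mcell ID follows. The function names are hypothetical, and the placement of the two fields within the word is an assumption: the text gives only the widths (27 and 5 bits), so this sketch puts the cell number in the low 5 bits and the page number in the high 27 bits. Check the bfMCIdT definition in the headers before relying on that order.

```python
CELL_BITS = 5
PAGE_BITS = 27
CELL_MASK = (1 << CELL_BITS) - 1

def encode_mcell_id(page, cell):
    """Pack a BMT page number and a cell index into one 32-bit value."""
    if not 0 <= page < (1 << PAGE_BITS):
        raise ValueError("page does not fit in 27 bits")
    if not 0 <= cell <= CELL_MASK:
        raise ValueError("cell does not fit in 5 bits")
    return (page << CELL_BITS) | cell   # assumed layout: cell in low bits

def decode_mcell_id(mcid):
    """Return (page, cell) from a 32-bit mcell ID."""
    return mcid >> CELL_BITS, mcid & CELL_MASK

mcid = encode_mcell_id(951, 4)          # BMT page 951, cell 4
print(hex(mcid), decode_mcell_id(mcid))
```

Five bits are enough for the cell index because a BMT page holds only 28 mcells.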
The 292-byte mcell begins with three fields used to link the chain of mcells associated with a particular bitfile. The first field is the 32-bit ID of the next mcell in the chain and the second field is the disk containing that mcell. Some bitfiles, in particular striped files, will have mcells located on several disks. The third field gives the position of this mcell within the chain. A disk and mcell ID of zero, indicates the end of the mcell chain.
The remaining two header fields are a pair of tags that uniquely identify the bitfile. The first tag is the tag of the bitfile itself. The second tag names the bitfile set, or file system, in which the bitfile is contained.
After these five header fields, 268 bytes remain within the mcell. These 268 bytes contain mcell records.
Reserved Mcells Lab Refresher
The primary mcells of certain important bitfiles must be located at fixed positions within the RBMT. Within RBMT page 0, there are seven reserved positions.
0 The RBMT itself
1 Storage bitmap; keeps track of free space on the disk
2 Root tag file; keeps track of fileset tag files
3 Transaction log bitfile; contains the AdvFS log
4 BMT; one of a chain of mcells associated with the BMT
5 Miscellaneous bitfile; contains the boot blocks and partition table
6 Information on volume and domain
BMT page 0 has a reserved mcell at slot 0. This mcell is the head of the BMT’s list of free mcells. Every disk must contain a BMT, a storage bitmap, and a miscellaneous bitfile; however, only one disk of the file domain needs a root tag file and a transaction log bitfile.
The reserved bitfiles do have special tags. Take the index of the virtual disk on which the reserved bitfile is located, multiply that number by -6, and then subtract the mcell position found in the above table. For example, the storage bitmap of disk two has tag -13, 2*(-6)-1, and a transaction log on disk seven has tag -45, 7*(-6)-3. You can use these tags to access reserved bitfiles via the .tags directory. /usr/.tags/-22 (or M-22) would be the BMT of the third virtual disk of the file domain that contains /usr.
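The tag rule can be sketched in Python as follows. The names RESERVED_POSITIONS, reserved_tag, and tags_entry are hypothetical; the positions come from the table of reserved mcells above, and the formula is the one stated in the text.

```python
# Reserved mcell positions within RBMT page 0, from the table above.
RESERVED_POSITIONS = {
    "rbmt": 0, "sbm": 1, "root_tag": 2, "log": 3, "bmt": 4, "misc": 5,
}

def reserved_tag(vol_index, bitfile):
    """Tag number of a reserved bitfile on virtual disk `vol_index`."""
    return vol_index * -6 - RESERVED_POSITIONS[bitfile]

def tags_entry(tag):
    """Name of the reserved bitfile in the .tags directory, e.g. 'M-7'."""
    return f"M{tag}"

print(reserved_tag(2, "sbm"))               # storage bitmap of disk two
print(tags_entry(reserved_tag(1, "sbm")))   # storage bitmap of disk one
print(reserved_tag(1, "sbm") & 0xFFFFFFFF)  # same tag viewed as unsigned
```

The last line shows why ls -li reports 4294967289 for /var/.tags/M-7: it is the tag -7 displayed as an unsigned 32-bit number.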
Mcell Records Lab Refresher
Mcell records vary in size and type. Every mcell record begins with a 4-byte header which gives the record’s size and type. Records are simply stored sequentially in the mcell. Some records are so large that they fill the entire mcell. Others are only 8 bytes long, including header.
The lower-numbered record types are associated with the BAS layer of AdvFS. These describe bitfile information such as extents and fileset names. The higher-numbered record types are associated with the FAS layer of AdvFS. This is where information such as file modification times and symbolic link names are stored.
Exercise
Within the /sbin/advfs directory is the nvbmtpg program which prints out BMT records. Read the reference pages for nvbmtpg.
1. Either use the showfdmn program or generate a recursive directory listing of /etc/fdmns to determine the name of a virtual disk (SCSI or LSM block device) of an AdvFS file domain.
2. Use the nvbmtpg command to look at the first two pages of the RBMT on your virtual disk. Save this information in a file. If a printer is available, you may want to print out this information.
3. Use the nvbmtpg -c command to look at the mcell chains for the BMT and storage bitmap. Write down the extent map of the BMT. You will find it useful.
4. Use the showfile -x command with the BMT’s .tags directory entry as an argument to verify that you successfully completed the last exercise.
5. Try using nvbmtpg with the BMT's .tags directory entry as an argument. You must have a fileset of the file domain successfully mounted to do this.
6. Verify that the data of your chosen files really is stored in the pages shown in the extent map by reading the data through the raw device that AdvFS uses to store the data. Here is one example of someone doing this exercise:
$ showfdmn -k play_dmn
Id Date Created LogPgs Domain Name
3269046b.000bc403 Sat Oct 19 12:40:11 1996 512 play_dmn

Vol 1K-Blks Free % Used Cmode Rblks Wblks Vol Name
1 1067152 402864 62% on 128 128 /dev/rz3c
2L 1055509 387696 63% on 128 128 /dev/rz2c
---------- ---------- ------
2122661 790560 63%

$ showfile -x bmtmisc.c

Id Vol PgSz Pages XtntType Segs SegSz Log Perf File
bf48.8002 1 16 1 simple ** ** off 100% bmtmisc.c

extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1 1 1601392 16
extentCnt: 1

# dd if=/dev/rrz3c of=temp ibs=512 iseek=1601392 count=16
dd: 16+0 records in.
dd: 16+0 records out.
7. Now create a striped file, at least 1 megabyte in size, and use nvbmtpg -c, and showfile to find its extent map. You must look into the BMTs of at least two virtual disks to obtain this information.
8. Create a file with a few large holes and then look up its extent map. An easy way to create a file with holes is:
# dd if=/vmunix of=holesome.dat oseek=500 count=100
# dd if=/vmunix of=holesome.dat oseek=1000 count=100
9. Create a clone fileset. Use ls -i to verify that the tag numbers of the files in both the original and clone filesets are the same. Use nvbmtpg to verify that even the mcell IDs are unchanged.
10. Modify a large file in the original fileset by appending some blocks to the end of the file. (#cat some_file >> large_file) (or use dd with the oseek and count options). Now use ls -i and nvbmtpg to see that, while the tags are unchanged, the mcell IDs are now changed. Use showfile -x and nvbmtpg -c to look at the extent maps of both original and clone.
11. Use nvbmtpg -c to look at mcell number 0 on page 0 of the BMT for the first volume of a domain. Note that the mcell free list is minimal. This is part of the dynamics built into AdvFS for Tru64 UNIX V5.
12. Look at bitfile attributes and bitfile inheritable attributes for reserved files, nonreserved regular files, and finally nonreserved directories.
13. Look at bitfile attributes for a clone file with a modified original and then look at the bitfile attributes for the original.
14. Print out the mcell records for an original and a cloned fileset.
15. This exercise is difficult. Try deleting a fileset with lots of large files. While the deletion is in progress, examine the fileset mcell record to look at the progress of the deletion through the delete pending change.
16. Convince yourself that executing the chfsets command really does result in modification of the appropriate fileset attributes record.
17. Examine the domain attribute and virtual disk records for your AdvFS disks.
POSIX File Information Lab Refresher
BMT record type 255 is file system stats BMTR_FS_STAT.
Every nonreserved file has the standard UNIX file characteristics, such as file permissions, file owner, file group, access time, and file size. These are stored in a single BMT record.
This is very similar to the information stored in a UFS inode; however, there are three additional items: a tag pointing back to the parent directory and two fields used to record fragment identification for small UNIX files.
struct fs_stat {
    bfTagT st_ino;
    mode_t st_mode;
    uid_t st_uid;
    gid_t st_gid;
    dev_t st_rdev;
    off_t st_size;
    time_t st_atime;
    int st_uatime;
    time_t st_mtime;
    int st_umtime;
    time_t st_ctime;
    int st_uctime;
    uint_t st_flags;        /* user defined flags for file */
    bfTagT dir_tag;         /* tag of parent directory */
    bfFragIdT fragId;
    short st_nlink;
    short st_unused_1;      /* pads out the 16-bit nlink field */
    uint32T fragPageOffset;
    uint32T st_unused_2;
};
The definition of this record is found in msfs/fs_dir.h.
BMT record type 254 is fast symbolic links BMTR_FS_DATA.
If an mcell corresponds to a symbolic link, the mode field of the POSIX file stat record is marked. If the name of the symbolic link target will fit into a BMT record, a special BMT record is created which contains the target name as its data value. Longer symbolic link names, which are very rare, are stored as file data.
1. Examine the BMT POSIX file stat records for a few of your files. See how the information stored in this record is reflected in the output of ls -l.
2. Now create some symbolic links and examine the corresponding BMT fast symbolic link records.
Trashcan Directories Lab Refresher
BMT record type 252 is undelete directory BMTR_FS_UNDEL_DIR.
Trashcan directories have a simple, on-disk implementation. All that is created is a BMT record within the mcell chain of the "source" directory containing the tag of the trashcan. Since the BMT records of files point to their parent directories, it is not difficult to determine the appropriate trashcan for a file.
struct undel_dir_rec {
    bfTagT dir_tag;
};
1. Use mktrashcan to create a trashcan directory and then examine the BMT record that points to the trash.
Time Lab Refresher
Keeping good time is supported by a BMT mcell record.
BMT record type 251 is file system time BMTR_FS_TIME.
This 4-byte record contains a value stored in UNIX standard time. It is found within the mcell chains for the root directories of file systems. For nonroot file systems, it is typically updated on file system updates. For the root file system, it is updated at regular file system synchronization time. When the system is rebooted, this BMT record is read from the root file system and compared with the time-of-day clock as a sanity check.
Find the mcell ID of a file system root and examine its BMT file system time records.
Tag File Lab Refresher
Like all bitfiles, the tag file is composed of 8192-byte pages. Each page consists of a 16-byte header followed by 1022 8-byte tagmap entries.
The 16 bytes of the header give the logical address of the page within the tag file, pointers to free tagmap entries and to pages with free entries, a count of allocated tagmap entries, and a count of dead tagmap entries.
There are actually three record formats for the 8-byte tagmap entries. The first entry on page 0 of the tag file isn’t really used to map tags to mcell IDs. Instead it contains the page addresses of the first tag file page with a free entry and the first tag file page that has not yet been initialized.
The second format for tagmap entries links the free entries of a page. Note that even though the entry is free, the sequence number is still maintained. The final format contains a real tag to mcell ID mapping.
Because tags can be reused when files are deleted and have their tags reclaimed, sequence numbers distinguish reincarnations of the tag. If a tag is in use, the first bit of the 16-bit sequence number is on. This is why the sequence numbers printed by showfile usually start with the hexadecimal digits 80. This leaves 15 bits for recording the real sequence number.
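The split between the in-use bit and the real sequence number can be shown with two masks. This Python sketch assumes only what the paragraph states: a 16-bit field whose top bit marks the tag as in use, leaving 15 bits for the count. The name split_sequence is hypothetical.

```python
IN_USE_BIT = 0x8000      # first (top) bit of the 16-bit sequence number
SEQ_MASK = 0x7FFF        # remaining 15 bits

def split_sequence(raw_seq):
    """Return (in_use, real_sequence) for a 16-bit sequence number."""
    return bool(raw_seq & IN_USE_BIT), raw_seq & SEQ_MASK

print(split_sequence(0x8001))   # first use of a tag
print(split_sequence(0x803a))   # a tag that has been reused many times
```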
Given a tag number, it is not hard to find the appropriate tagmap entry. Divide the tag number by 1022 to find the appropriate tag file page. Use the remainder of that division to find the tagmap entry within the page. Just multiply that remainder by eight and add in 16 for the page header.
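The lookup just described takes one division. The Python sketch below follows the paragraph step by step; tagmap_location is a hypothetical name, and the example uses tag 22894, the ob_1 file from Example 2-17.

```python
TAG_PAGE_BYTES = 8192
TAG_HEADER_BYTES = 16
TAGMAP_ENTRY_BYTES = 8
ENTRIES_PER_PAGE = 1022

# The header plus 1022 entries fill the page exactly.
assert TAG_HEADER_BYTES + ENTRIES_PER_PAGE * TAGMAP_ENTRY_BYTES == TAG_PAGE_BYTES

def tagmap_location(tag_num):
    """Return (tag file page, byte offset in that page) for a tag number."""
    page, slot = divmod(tag_num, ENTRIES_PER_PAGE)
    return page, TAG_HEADER_BYTES + slot * TAGMAP_ENTRY_BYTES

print(tagmap_location(22894))
```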
Two Types of Tag Files Lab Refresher
Every AdvFS file domain has a single root tag file that gives the location of the mcells associated with the filesets of the domain. The primary mcell of the root tag file is located in position 2 of page 0 of the RBMT. Only one virtual disk of the domain contains the true root tag file. The domain mutable attributes record of the RBMT identifies the real one. The root tag file is usually found at block 96 on the virtual disk, but an aggressive use of addvol and rmvol may cause it to move to another location.
Every AdvFS fileset has its own tag file. The fileset ID, found in the bitfile-set attributes record, is the tag for the fileset. Use this number as an index into the root tag file to find the mcell ID for the fileset itself. The extent map found in the fileset’s mcell chain gives the pages of the fileset’s own tag file. In general, the search goes in the other direction; the root tag file is searched to locate a particular fileset.
For convenience, the .tags directory has its own special naming convention for fileset tag files. Putting the letter M before the fileset ID gives the name of the tag file as found in the .tags directory. For example, if the mounted file system /playpen has fileset ID 5, /playpen/.tags/M5 is the name of /playpen’s tagfile.
Exercise
1. Start by running showfsets on your AdvFS file domains so that you will know a few fileset IDs to use in the remaining exercises.
2. Use the nvtagpg program, located in /sbin/advfs, to list the root tag files of an AdvFS file domain.
3. Select a target file and use both showfile and ls -i to obtain its tag number. The reason for using two programs is that one prints the tag number in decimal and the other prints the sequence number.
4. Divide the tag number by 1022 and write down both the quotient and remainder. The quotient determines the page number containing the appropriate tagmap entry while the remainder determines the position within the page. (The previous calculation was more useful in V4 than in V5 of Tru64 UNIX.) Now use nvtagpg to get the tagfile page entry.
If the sequence numbers of nvtagpg and showfile do not match, you do not have the right tagmap entry. You may need to convert from hexadecimal to decimal to verify the match.
5. Use the showfile -x command on the .tags M file for a fileset to determine the extent map of a fileset’s tag file.
Directory Lab Refresher
Now look at the higher-level POSIX directories. AdvFS is designed to use almost the same format for directory files as UFS. The only difference is that AdvFS uses the padding of the directory entry to store the 8-byte file tag. One advantage of this is that those unreformed programs that read UFS directory files, rather than use the getdirentries(2) system call, to determine the files of a directory, will work equally well, or poorly, on UFS and AdvFS file systems.
AdvFS has added some additional support for very large directories. The performance improvements include the creation of a B-tree index supporting directories that are more than 8K in size. This dramatically improves file creation and deletion performance. Improvement becomes more noticeable when the directory contains more than ~2500 files.
Each 8192-byte directory page ends with a 12-byte directory record that has two fields for tracking free directory entries and one field for the page type, presently always one, meaning a sequential directory.
The remaining space within the directory page is occupied by directory entries. In fact, the entire directory page is occupied by directory entries, because the 12-byte directory record is stored within the padding of a directory entry.
Every file within a directory has its own directory entry. The AdvFS directory entry has five fields and two places where padding may be inserted. The first three fields are considered the directory header. The first field, 4 bytes in size, gives the tag number of the file. In a UFS directory entry, this is where the inode number is stored. The next two fields, each 2 bytes long, give the size of the directory entry and the size of the file name.
The fourth field of the directory entry is the file name.
In AdvFS there is a fifth field, an 8-byte tag consisting of tag number and sequence. Up to 4 bytes of padding may precede the tag to ensure that the tag is stored on a 4-byte boundary. More padding may follow the tag to fill out the entire directory entry.
The reason why there may be up to 4 bytes of padding before the tag is that the file name field is always followed by at least one null character.
Directory entries should never cross sector, 512-byte, boundaries. For this reason, you often see that directory entries near the end of a sector will have a generous allotment of padding. This enables them to fill the sector.
A directory entry with zero in its first 4 bytes, the location of the tag number in AdvFS and the inode number in UFS, is considered empty. When a directory is created, it is given two directory entries for . and .. which are placed at the beginning of the directory file. The remainder of the directory file is filled with empty sector-sized directory entries.
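The entry layout described above can be modeled in a few lines of Python. This is a minimal sketch, not the kernel's definition: it assumes little-endian fields, assumes the 8-byte tag starts at the first 4-byte boundary after the file name and its terminating null, and uses hypothetical names (HEADER, TAG, min_entry_size, build_entry, parse_entry).

```python
import struct

HEADER = struct.Struct("<IHH")  # tag number, entry size, name length (assumed little-endian)
TAG = struct.Struct("<II")      # tag number, sequence number


def padded_name_len(name_len):
    # The name is followed by at least one null, then padded to a 4-byte boundary.
    return (name_len + 1 + 3) // 4 * 4


def min_entry_size(name_len):
    """Smallest entry that holds the header, the padded name, and the 8-byte tag."""
    return HEADER.size + padded_name_len(name_len) + TAG.size


def build_entry(tag_num, seq, name, entry_size=None):
    raw = name.encode("ascii")
    size = entry_size or min_entry_size(len(raw))
    body = (HEADER.pack(tag_num, size, len(raw))
            + raw.ljust(padded_name_len(len(raw)), b"\0")
            + TAG.pack(tag_num, seq))
    return body.ljust(size, b"\0")  # trailing padding fills out the entry


def parse_entry(buf, offset=0):
    tag_num, size, name_len = HEADER.unpack_from(buf, offset)
    if tag_num == 0:  # zero in the first 4 bytes marks an empty entry
        return {"empty": True, "size": size, "name": "", "full_tag": None}
    start = offset + HEADER.size
    name = buf[start:start + name_len].decode("ascii")
    full_tag = TAG.unpack_from(buf, start + padded_name_len(name_len))
    return {"empty": False, "size": size, "name": name, "full_tag": full_tag}


if __name__ == "__main__":
    entry = build_entry(10, 0x8001, "sm1")
    print(len(entry), parse_entry(entry))
```

Under these assumptions a one-character name needs a 20-byte entry, and a four-character name needs the full 4 bytes of padding before the tag.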
Exercise
1. Create a directory and connect into it. Now look at the directory file by typing the following commands:
# vfilepg -r domain_name fileset_name directory/spec -f d
# od -A x -a -h -H . | more
Notice the entries for . and .. along with all the empty directory entries.
2. Use the touch command to create five files, i, ii, iii, iv, and v, within your new directory. Use od and vfilepg to examine the directory file.
3. Remove the file iii. Use od to determine what happens to the directory entry for iii.
4. Now remove ii. Notice how the old directory entries for ii and iii have been merged.
5. You may have noticed that the tags for ii and iii continue to reside in the directory file and may be wondering about the possibilities for file undeletion. Return to reality by remembering what happens to free tagmap entries.
6. Create, via touch, a file vii and notice where its directory entry is placed.
7. Create several files with very, very long names and see how the creation of directory entries avoids crossing the sector boundaries.
8. Do a showfile -i command on an 8K directory file. Make a larger directory file. What does showfile -i indicate on the larger directory?
Fragment File Lab Refresher
The bitfile access system of AdvFS only allocates disk space in 8192-byte pages. Since there are some rather clear inefficiencies to storing 117-byte or even 9000-byte files in 8192-byte pages, AdvFS stores some files in an integral number (possibly zero) of 8192-byte pages followed by a fragment of 1024 to 7168 bytes.
To convince yourself that something unusual really is happening, execute the following commands:
$ dd if=/etc/disktab of=frag.file
$ ls -l frag.file
$ showfile -x frag.file
You have already encountered mention of fragments in two BMT records: the bitfile-set attributes record contained an array of eight fragment group headers, and the POSIX file stats record contained a fragment ID and page offset.
If you were paying careful attention to the output of nvtagpg, you may have noticed something else unusual; every fileset has a bitfile with tag number 1. The number one bitfile of a fileset is the fragment bitfile. If an FAS-level or POSIX-level file has its final bytes stored in a fragment, those bytes are stored inside the fragment bitfile.
Finding a File’s Fragment
The fragId field of the POSIX file stats record is used to record the position of a file’s fragment. This field is actually a two-part record: fragId.frag is the offset, in 1024-byte blocks, of the file’s fragment within the fileset’s fragment bitfile, and fragId.type is the number of 1024-byte blocks allocated to this fragment.
Two other fields of the file stats records are also involved with fragment management. The fragPageOffset field records the logical page address of the fragment within the fragment bitfile. Unless the file has holes, this will equal the number of full pages allocated to the file. One other field, st_size, is needed to determine just exactly how many bytes of the fragment are used by the file. Take the remainder of dividing this field by 8192, and you have the number of bytes of the fragment that contain real file data.
If the fragId.type field is zero, the file has no fragments.
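The arithmetic in this section can be checked with a short Python sketch. It assumes the file really does end in a fragment (fragId.type is nonzero); large files such as big3 in this guide occupy whole pages instead, so consult fragId.type first. The function names are hypothetical, and the handling of a tail longer than 7168 bytes (one more full page) is an assumption, since the text only says fragments range from 1024 to 7168 bytes.

```python
PAGE = 8192   # bytes per AdvFS page
BLOCK = 1024  # bytes per fragment block


def fragment_tail(st_size):
    """Return (full_pages, frag_blocks, frag_bytes_used) for a file ending in a fragment."""
    full_pages, tail = divmod(st_size, PAGE)
    if tail == 0:
        return full_pages, 0, 0
    frag_blocks = -(-tail // BLOCK)  # round up to whole 1024-byte blocks
    if frag_blocks > 7:
        # Assumption: a tail over 7168 bytes is stored in one more full page.
        return full_pages + 1, 0, 0
    return full_pages, frag_blocks, tail


def fragment_byte_range(frag, st_size):
    """Byte range of the file's data inside the fragment bitfile (.tags/1)."""
    start = frag * BLOCK
    return start, start + st_size % PAGE


if __name__ == "__main__":
    print(fragment_tail(62228))             # sm1 before the append
    print(fragment_tail(93342))             # sm1 after the append
    print(fragment_byte_range(129, 62228))
```

For the sm1 file used in the solutions, an st_size of 62228 gives seven full pages and a five-block fragment, which agrees with the BF_FRAG_5K value that nvbmtpg prints; after the append, 93342 bytes give eleven pages and a four-block fragment (BF_FRAG_4K).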
Exercise
1. Execute the ls -l command on some fragment bitfiles. Remember that .tags/1 gets you to a fileset’s fragment bitfile.
2. Copy some randomly sized file, say /etc/disktab, onto your AdvFS file system. Now use nvbmtpg to find out where your file’s fragment resides within the fragment bitfile.
3. Now use dd to copy that fragment directly out of the fragment bitfile. You’ll use a command similar to:
# dd if=/playpen/.tags/1 of=/tmp/copy ibs=1024 iseek=76275 count=3
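The same copy can be sketched in Python, which makes it easy to read only the bytes that hold real file data rather than whole 1024-byte blocks. This is an illustrative sketch: read_fragment is a hypothetical helper, and it takes an already opened binary file object so that it can be tried on an in-memory buffer before being pointed at a real .tags/1 file.

```python
import io

BLOCK = 1024  # fragId.frag is an offset in 1024-byte blocks
PAGE = 8192


def read_fragment(fragment_bitfile, frag, st_size):
    """Read the used bytes of a file's fragment.

    fragment_bitfile: binary file object opened on the fileset's .tags/1
    frag:             fragId.frag from the file stats record
    st_size:          st_size from the same record
    """
    fragment_bitfile.seek(frag * BLOCK)
    return fragment_bitfile.read(st_size % PAGE)


if __name__ == "__main__":
    # In-memory stand-in for a fragment bitfile: data starts at block 2.
    fake = io.BytesIO(bytes(2 * BLOCK) + b"tail data" + bytes(100))
    print(read_fragment(fake, 2, 3 * PAGE + 9))
```

On a live system the file object would come from something like open("/playpen/.tags/1", "rb"), with frag and st_size taken from the nvbmtpg output for your file.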
Managing the Fragment Bitfile
Concentrate on the on-disk structure of the fragment bitfile. Of course, the fragment file is composed of 8192-byte pages. However, these pages are managed in fragment groups of 16 pages, or 128 kilobytes, each. Within a group, all allocated fragments are of the same size, that is, the same number of 1024-byte blocks. Consequently, fragment groups fall into one of eight different types: one type for each of the seven fragment sizes, and an eighth type for fragment groups which have no allocated fragments.
Fragment Group Header
Each fragment group begins with a header. The first two fields of the header are not used in the current version of AdvFS. The third field is used to link together all the fragment groups of the same type. The fourth field is the page number of the fragment group header and a good place to verify that you are looking at a group header page. A type indicator, count of free fragments, and the fileset ID occupy the next three fields. The eighth field is the version number of the fragment implementation. Presently, we are on version 1. The ninth field is the address, in 1024-byte blocks relative to the beginning of the fragment bitfile, of the first fragment within this group. The remaining fields are used as a free list for the group’s fragments.
Recall that all the fragments within a group are of the same size. This is even the case for fragment number zero, which contains the fragment group header followed by a lot of unused space. The free list is an array which has one element per fragment. Each element of the free list which corresponds to a free fragment points to the next free fragment. Array element 0 of the free list is either -1, which indicates that the group has no free fragment, or the address of the first free fragment. To find the remaining free fragments, work your way through the free list until you encounter a -1. MS-DOS gurus should think about the file allocation table.
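Walking the free list is a simple linked-list traversal. The Python sketch below is illustrative only: it assumes each element holds the array index of the next free fragment, and that -1 appears as 0xFFFF because the on-disk array holds unsigned 16-bit values. The name free_fragments is hypothetical.

```python
END = 0xFFFF  # assumption: -1 stored in an unsigned 16-bit element


def free_fragments(free_list):
    """Follow the chain from element 0 and return the free fragment indexes in order."""
    free = []
    seen = set()
    nxt = free_list[0]
    while nxt not in (END, -1):
        # Guard against loops and out-of-range links in a damaged header.
        if nxt in seen or not 0 < nxt < len(free_list):
            raise ValueError("corrupt free list at element %d" % nxt)
        seen.add(nxt)
        free.append(nxt)
        nxt = free_list[nxt]
    return free


if __name__ == "__main__":
    # Element 0 is the head; fragments 2, 4 and 6 are free in this made-up group.
    print(free_fragments([2, END, 4, END, 6, END, END]))
```

A head element equal to the end marker yields an empty list, which corresponds to a group with no free fragments.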
Here’s the definition of the group header taken from the kernel source file msfs/msfs/bs_bitfile_sets.h:
typedef struct grpHdr {
    uint32T nextFreeFrag;   /* frag index (valid only when "version == 0") */
    uint32T lastFreeFrag;   /* frag index (valid only when "version == 0") */
    uint32T nextFreeGrp;    /* page number */
    uint32T self;           /* this group's starting page number */
    bfFragT fragType;       /* type of frags in this group */
    int freeFrags;          /* number of free frags in the group */
    bfSetIdT setId;         /* bitfile-set's ID */

    /*
     * the following fields were added in ADVFS v3.0
     * they were all zeros in pre-ADVFS v3.0
     */
    unsigned int version;   /* metadata version pre-ADVFS v1.0 == 0, ADVFS v3.0 == 1 */
    uint32T firstFrag;      /* frag index */

    /*
     * the following is used a map of the free frags in the group.
     * it a linked list where element zero (0) is used as the head
     * of the list (since frag 0 is always the group header it can
     * never be allocated so element zero would otherwise be unused)
     */
    unsigned short freeList[ BF_FRAG_GRP_SLOTS ];
} grpHdrT;
Finding Fragment Group Headers
The remaining mystery is the location of the group headers themselves. The addresses, in 8192-byte pages, of the first and last group headers for each of the eight fragment types are found in the last field, fragGrps, of the bitfile-set attributes record for the fileset. This record is found in the BMT chain for the fileset’s tagfile.
Exercise
1. The program nvfragpg, found in /sbin/advfs, prints various interesting statistics about fragment usage within the eight different fragment groups. Read the reference page for this command and then apply it to each of your AdvFS filesets.
2. Use the nvtagpg command to find the mcell IDs of some AdvFS bitfile-sets. Now use nvfragpg to print out the addresses of the fragment group headers for these filesets.
3. The nvfragpg program, also located in /sbin/advfs, will print out a list of the free fragments found within a fragment group along with the address of the next group of that type.
SBM Lab Refresher
Every AdvFS virtual disk has a storage bitmap bitfile which tracks free and used disk blocks. It is little more than an array of bits, one for each 1K-byte cluster of disk storage, so each 8K page is represented by eight bits.
SBM Page Format
Each SBM page begins with a two-field header. The first field once stored a log sequence number but is presently always set to zero. The second field is a 32-bit exclusive OR, or parity, of all the remaining 32-bit words in the SBM page. The remaining 2046 words are a bitmap for 65,472 clusters of the virtual disk, eight clusters to each 8K page. If a cluster is allocated, its bit is set. If a cluster is free, its bit is clear. Since AdvFS always allocates disk storage in pages rather than clusters, it’s easier to imagine each page of the SBM as having 8184 bytes corresponding to pages.
The algorithm for determining if a page of the disk is free is simple. Take the page number and divide by 8184. The quotient points you to an SBM page, and the remainder points you to a byte within that page after you add in eight for the size of the header.
Obviously, using this sort of representation to manage free storage inside the kernel would be very inefficient. We'll soon see the kernel's in-core structure, which is very different from the on-disk structure.
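The page format can be exercised with a small Python sketch that builds an 8192-byte SBM page in memory and checks its parity word. It is a sketch under stated assumptions: the words are taken to be little-endian, one bitmap byte is taken to stand for one 8K disk page, and the names make_sbm_page, sbm_parity_ok, and page_in_use are hypothetical.

```python
import struct

SBM_PAGE = 8192
HEADER_BYTES = 8                         # two 32-bit header fields
BITMAP_BYTES = SBM_PAGE - HEADER_BYTES   # 8184 bytes, 2046 words


def xor_words(data):
    """XOR together all 32-bit words in data (assumed little-endian)."""
    parity = 0
    for (word,) in struct.iter_unpack("<I", data):
        parity ^= word
    return parity


def make_sbm_page(bitmap):
    """Build a page: zero first field, parity second field, then the bitmap."""
    if len(bitmap) != BITMAP_BYTES:
        raise ValueError("bitmap must be %d bytes" % BITMAP_BYTES)
    return struct.pack("<II", 0, xor_words(bitmap)) + bytes(bitmap)


def sbm_parity_ok(page):
    """Compare the stored parity word with the XOR of the 2046 bitmap words."""
    stored = struct.unpack_from("<I", page, 4)[0]
    return stored == xor_words(page[HEADER_BYTES:])


def page_in_use(page, index):
    """True if any cluster bit is set in the byte for the index-th disk page."""
    return page[HEADER_BYTES + index] != 0


if __name__ == "__main__":
    page = make_sbm_page(b"\xff" * 3 + bytes(BITMAP_BYTES - 3))
    print(sbm_parity_ok(page), page_in_use(page, 2), page_in_use(page, 3))
```

Flipping any single bit in the bitmap makes sbm_parity_ok return False, which is the purpose of the parity word.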
Exercise
1. Start out by running od -x on one of your storage bitmap files. The command syntax will be something like:
# od -x -N 1024 /usr/.tags/-7
2. Repeat the previous exercise, but this time use the virtual disk interface. You must use showfile -x to find the extent map for the storage bitmap. It will look similar to:
# od -x -j 112b -N 1024 /dev/disk/dsk3c
3. Here is how to determine if page 17000 of an AdvFS virtual disk is free:
# expr 17000 / 8184 \* 8192 + 17000 % 8184 + 8
17024
# od -x -j 17024 -N 1 /usr/.tags/-7
4. Tru64 UNIX V5 supplies a much more convenient command, vsbmpg. See the reference page for this command. Accomplish the same result as the previous exercise without all the arithmetic. Find out if page 50000 of one of your virtual disks is free.
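The expr arithmetic in exercise 3 translates directly into a small function, shown here as a Python sketch. The name sbm_byte_offset is hypothetical; the constants come from the SBM page format (8192-byte pages with an 8-byte header, leaving 8184 bytes, one per disk page).

```python
def sbm_byte_offset(disk_page):
    """Byte offset within the storage bitmap file of the byte for disk_page."""
    sbm_page, byte_in_page = divmod(disk_page, 8184)
    return sbm_page * 8192 + 8 + byte_in_page


if __name__ == "__main__":
    print(sbm_byte_offset(17000))  # the value passed to od -j in exercise 3
```

The result for page 17000 is 17024, the same value expr prints.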
Miscellaneous Bitfile Lab Refresher
Every page on an AdvFS virtual disk is either free or assigned to a bitfile. To satisfy this requirement, several data blocks are put into special miscellaneous bitfiles. These data blocks consist of the disk pages containing the partition table, the primary and secondary boot blocks, and the AdvFS magic number.
Exercise
Use showfile to see the extents of your miscellaneous bitfile. Find them at .tags/-11, and so forth.
Solutions
1. Use the showfile and the ls -i commands to list the tag numbers of a few AdvFS files and then access the files through the appropriate .tags directory.
#
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025668 294688 78% /usr
usr_domain#var 1426112 75822 294688 21% /var
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
bruden_dom#dennis_fset 2251840 91118 2082800 5% /usr/dennis
#
# cd /usr/dennis
#
# ls -li
total 45567
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
11 drwxrwxrwx 2 root system 8192 Sep 28 17:38 den_trash
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 31114 Sep 28 17:32 sm1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
#
# ls -li big3
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
#
# showfile -x big3
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
8.8001 2 16 1422 simple ** ** async 100% big3
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1422 2 75504 22752
extentCnt: 1
#
#
# ls -li ./.tags/8
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 ./.tags/8
#
#
# showfile -x .tags/8
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
8.8001 2 16 1422 simple ** ** async 100% 8
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1422 2 75504 22752
extentCnt: 1
#
2. Use the tag2name command, located in /sbin/advfs, to translate an AdvFS tag into the corresponding file name.
#
# tag2name .tags/8
/usr/dennis/big3
#
#
#
# tag2name -r bruden_dom dennis_fset 8
big3
#
3. Read the reference pages for nvbmtpg.
# man nvbmtpg
(...)
4. Either use the showfdmn program or generate a recursive directory listing of /etc/fdmns to determine the name of a virtual disk (SCSI or LSM block device) of an AdvFS file domain.
# ls -lR /etc/fdmns
total 4
-r-------- 1 root system 0 Sep 28 16:59 .advfslock_bruden_dom
-r-------- 1 root system 0 Sep 15 00:38 .advfslock_domain_dsk0g
-r-------- 1 root system 0 Sep 14 16:29 .advfslock_domain_dsk2g
-r-------- 1 root system 0 Sep 11 11:08 .advfslock_fdmns
-r-------- 1 root system 0 Sep 11 11:08 .advfslock_usr_domain
drwxr-xr-x 2 root system 512 Sep 28 17:08 bruden_dom
drwxr-xr-x 2 root system 512 Sep 14 16:30 domain_dsk0g
drwxr-xr-x 2 root system 512 Sep 14 16:37 domain_dsk2g
drwxr-xr-x 2 root system 512 Sep 11 11:08 usr_domain
/etc/fdmns/bruden_dom:
total 0
lrwxr-xr-x 1 root system 15 Sep 28 16:59 dsk0a -> /dev/disk/dsk0a
lrwxr-xr-x 1 root system 15 Sep 28 17:01 dsk0b -> /dev/disk/dsk0b
lrwxr-xr-x 1 root system 15 Sep 28 17:08 dsk2h -> /dev/disk/dsk2h
/etc/fdmns/domain_dsk0g:
total 0
lrwxr-xr-x 1 root system 15 Sep 14 16:30 dsk0g -> /dev/disk/dsk0g
/etc/fdmns/domain_dsk2g:
total 0
lrwxrwxrwx 1 root system 15 Sep 14 16:37 dsk0g -> /dev/disk/dsk0g
lrwxr-xr-x 1 root system 15 Sep 14 16:28 dsk2g -> /dev/disk/dsk2g
/etc/fdmns/usr_domain:
total 0
lrwxr-xr-x 1 root system 15 Sep 11 11:08 dsk1g -> /dev/disk/dsk1g
5. Use the nvbmtpg command to look at the first two pages of the RBMT on your virtual disk. Save this information in a file. If a printer is available, you may want to print out this information.
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 65568 50% on 256 256 /dev/disk/dsk0a
2 262144 215648 18% on 256 256 /dev/disk/dsk0b
3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2082800 8%
#
# nvbmtpg -rR bruden_dom 1 0
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0
--------------------------------------------------------------------------
CELL 0 next mcell volume page cell 1 0 6 bfSetTag,tag -2,-6(RBMT)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 32 (0x20)
bsXA[ 1] bsPage 1 vdBlk -1
--------------------------------------------------------------------------
CELL 1 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-7 (SBM)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 112 (0x70)
bsXA[ 1] bsPage 1 vdBlk -1
--------------------------------------------------------------------------
CELL 2 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-8 (TAG)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 96 (0x60)
bsXA[ 1] bsPage 1 vdBlk -1
--------------------------------------------------------------------------
CELL 3 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-9 (LOG)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 128 (0x80)
bsXA[ 1] bsPage 512 vdBlk -1
--------------------------------------------------------------------------
CELL 4 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-10 (BMT)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 48 (0x30)
bsXA[ 1] bsPage 1 vdBlk -1
--------------------------------------------------------------------------
CELL 5 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-11 (Misc)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 160 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 0 xCnt 3
bsXA[ 0] bsPage 0 vdBlk 0 (0x0)
bsXA[ 1] bsPage 2 vdBlk 64 (0x40)
bsXA[ 2] bsPage 4 vdBlk -1
--------------------------------------------------------------------------
CELL 6 next mcell volume page cell 1 0 27 bfSetTag,tag -2,-6(RBMT)
RECORD 0 bCnt 40 BSR_VD_ATTR
vdMntId 37f2025a.0008c86b (Wed Sep 29 08:13:14 1999)
vdIndex 1 vdBlkCnt 131072
RECORD 1 bCnt 24 BSR_DMN_ATTR
bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)
RECORD 2 bCnt 52 BSR_DMN_MATTR
uid 0 gid 1 mode 0744
vdCnt 3
RECORD 3 bCnt 20 BSR_DMN_TRANS_ATTR
6. Use the nvbmtpg -c command to look at the mcell chains for the BMT and storage bitmap. Write down the extent map of the BMT. You’ll find it useful.
# nvbmtpg -rR bruden_dom 1 0 4 -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0
--------------------------------------------------------------------------
CELL 4 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-10 (BMT)
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 48 (0x30)
bsXA[ 1] bsPage 1 vdBlk -1
7. Use the showfile -x command with the BMT’s .tags directory entry as an argument to verify that you successfully completed the last exercise.
# showfile -x /usr/dennis/.tags/M-10
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
fffffff6.0000 1 16 1 simple ** ** ftx 100% M-10
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1 1 48 16
extentCnt:
8. Try using nvbmtpg with the BMT’s .tags directory entry as an argument. You must have a fileset of the file domain successfully mounted to do this.
# nvbmtpg -r /usr/dennis/.tags/M-10
==========================================================================
FILE "/usr/dennis/.tags/M-10" RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the BMT in this file.
The BMT uses 1 extents (out of 1) in 1 mcell.
#
9. Verify that the data of your chosen files really is stored in the pages shown in the extent map by reading the data through the raw device that AdvFS uses to store the data. Here is one example of someone performing this exercise:
$ showfdmn -k play_dmn
Id Date Created LogPgs Domain Name
3269046b.000bc403 Sat Oct 19 12:40:11 1996 512 play_dmn
Vol 1K-Blks Free % Used Cmode Rblks Wblks Vol Name
1 1067152 402864 62% on 128 128 /dev/rz3c
2L 1055509 387696 63% on 128 128 /dev/rz2c
---------- ---------- ------
2122661 790560 63%
$ showfile -x bmtmisc.c
Id Vol PgSz Pages XtntType Segs SegSz Log Perf File
bf48.8002 1 16 1 simple ** ** off 100% bmtmisc.c
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 1 1 1601392 16
extentCnt: 1
# dd if=/dev/rrz3c of=temp ibs=512 iseek=1601392 count=16
dd: 16+0 records in.
dd: 16+0 records out.
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 65568 50% on 256 256 /dev/disk/dsk0a
2 262144 215648 18% on 256 256 /dev/disk/dsk0b
3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2082800 8%
#
#
# showfile -x sm1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
a.8001 2 16 3 simple ** ** async 100% sm1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 3 2 121328 48
extentCnt: 1
#
# dd if=/dev/rdisk/dsk0b of=/tmp/chunk1 ibs=512 iseek=121328 count=1
1+0 records in
1+0 records out
#
#
# cat /tmp/chunk1
#
# *****************************************************************
# * *
# * Copyright (c) Digital Equipment Corporation, 1991, 1999 *
# * *
# * All Rights Reserved. Unpublished rights reserved under *
# * the copyright laws of the United States. *
# * *
# * The software contained on t# #
#
# pg sm1
#
# *****************************************************************
# * *
# * Copyright (c) Digital Equipment Corporation, 1991, 1999 *
# * *
# * All Rights Reserved. Unpublished rights reserved under *
# * the copyright laws of the United States. *
# * *
# * The software contained on this media is proprietary to *
# * and embodies the confidential technology of Digital *
# * Equipment Corporation. Possession, use, duplication or *
# * dissemination of the software and media is authorized only *
# * pursuant to a valid written license from Digital Equipment *
# * Corporation. *
# * *
# * RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure *
# * by the U.S. Government is subject to restrictions as set *
# * forth in Subparagraph (c)(1)(ii) of DFARS 252.227-7013, *
# * or in FAR 52.227-19, as applicable. *
# * *
# *****************************************************************
# HISTORY
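The lookup performed by hand in this solution, from a file page to a 512-byte block on a volume, can be sketched in Python. The function page_to_block and the tuple layout are hypothetical; the rows are the pageOff, pageCnt, vol, and volBlock columns of showfile -x, and the factor of 16 is the PgSz column (512-byte blocks per 8K page).

```python
BLOCKS_PER_PAGE = 16  # PgSz column of showfile: 512-byte blocks per page


def page_to_block(extents, page):
    """Map a file page to (volume, block) using showfile -x extent rows.

    extents: list of (pageOff, pageCnt, vol, volBlock) tuples
    Returns None when the page is not backed by any extent.
    """
    for page_off, page_cnt, vol, vol_block in extents:
        if page_off <= page < page_off + page_cnt:
            return vol, vol_block + (page - page_off) * BLOCKS_PER_PAGE
    return None


if __name__ == "__main__":
    sm1 = [(0, 3, 2, 121328)]     # extent map of sm1 shown above
    print(page_to_block(sm1, 0))  # block to pass to dd as iseek
    print(page_to_block(sm1, 2))
    print(page_to_block(sm1, 3))  # past the last extent
```

The block number returned is what goes into the iseek argument of dd, read from the raw device of the volume named in the first element.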
10. Now create a striped file, at least 1 megabyte in size, and use nvbmtpg -c and showfile to find its extent map. You must look into the BMTs of at least two virtual disks to obtain this information.
# showfile -x stripe1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
c.8001 2 16 1422 stripe 2 8 async 100% stripe1
extentMap: 1
pageOff pageCnt volIndex volBlock blockCnt
0 8 3 1093536 11392
16 8
32 8
(…)
1392 8
1408 8
extentCnt: 1
extentMap: 2
pageOff pageCnt volIndex volBlock blockCnt
8 8 1 80704 11360
24 8
40 8
(…)
1400 8
1416 6
extentCnt: 1
#
# ls -li stripe1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
#
#
# nvbmtpg -r bruden_dom dennis_fset -t 12 -c
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 12 next mcell volume page cell 0 0 0 bfSetTag,tag 2,12
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_STRIPE chain mcell volume page cell 3 0 8
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 11646960
st_nlink 1 dir_tag 2 st_mtime Tue Sep 28 17:41:55 1999
Extent mcells from BSR_XTNTS record chain pointer.
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 8 next mcell volume page cell 1 0 18 bfSetTag,tag 2,12
RECORD 0 bCnt 260 BSR_SHADOW_XTNTS
allocVdIndex 3 mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 1093536 (0x10afa0)
bsXA[ 1] bsPage 712 vdBlk -1
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 18 next mcell volume page cell 0 0 0 bfSetTag,tag 2,12
RECORD 0 bCnt 260 BSR_SHADOW_XTNTS
allocVdIndex 1 mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 80704 (0x13b40)
bsXA[ 1] bsPage 710 vdBlk -1
#
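The page offsets in the two extent maps above follow a round-robin pattern, which the following Python sketch reproduces. It assumes pages are dealt out to the segments in runs of SegSz pages, as the showfile output suggests; the names stripe_location and pages_per_segment are hypothetical.

```python
def stripe_location(page, segs=2, seg_pages=8):
    """Return (segment index, page number within that segment) for a file page."""
    chunk, within = divmod(page, seg_pages)
    segment = chunk % segs
    segment_page = (chunk // segs) * seg_pages + within
    return segment, segment_page


def pages_per_segment(total_pages, segs=2, seg_pages=8):
    """Count how many of a file's pages land on each stripe segment."""
    counts = [0] * segs
    for page in range(total_pages):
        counts[stripe_location(page, segs, seg_pages)[0]] += 1
    return counts


if __name__ == "__main__":
    print(stripe_location(8))       # first page of the second extent map
    print(pages_per_segment(1422))  # stripe1 has 1422 pages
```

For the 1422-page stripe1 file this gives 712 and 710 pages, matching the terminating bsPage values in the two BSR_SHADOW_XTNTS records.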
11. Create a file with a few large holes and then look up its extent map. An easy way to create a file with holes is:
# dd if=/vmunix of=holesome.dat oseek=500 count=100
# dd if=/vmunix of=holesome.dat oseek=1000 count=100
# dd if=/vmunix of=holesome.dat oseek=500 count=100
100+0 records in
100+0 records out
#
# dd if=/vmunix of=holesome.dat oseek=1000 count=100
100+0 records in
100+0 records out
#
# showfile -x holesome.dat
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
9.8002 1 16 14 simple ** ** async 33% holesome.dat
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
31 7 1 57600 112
62 1 1 57744 16
63 6 1 80512 96
extentCnt: 3
#
#
# nvbmtpg -r bruden_dom dennis_fset -t 9 -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 17 next mcell volume page cell 0 0 0 bfSetTag,tag 2,9
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 1 0 16
firstXtnt mcellCnt 2 xCnt 1
bsXA[ 0] bsPage 0 vdBlk -1
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 563200
st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:17:19 1999
Extent mcells from BSR_XTNTS record chain pointer.
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 16 next mcell volume page cell 0 0 0 bfSetTag,tag 2,9
RECORD 0 bCnt 264 BSR_XTRA_XTNTS
xCnt 6
bsXA[ 0] bsPage 0 vdBlk -1
bsXA[ 1] bsPage 31 vdBlk 57600 (0xe100)
bsXA[ 2] bsPage 38 vdBlk -1
bsXA[ 3] bsPage 62 vdBlk 57744 (0xe190)
bsXA[ 4] bsPage 63 vdBlk 80512 (0x13a80)
bsXA[ 5] bsPage 69 vdBlk -1
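The relationship between the bsXA array printed by nvbmtpg and the rows printed by showfile -x can be made explicit with a Python sketch. It assumes each bsXA element marks the first page of a run, that a vdBlk of -1 marks a hole or the end of the map, and that each page is 16 blocks; the function names are hypothetical.

```python
BLOCKS_PER_PAGE = 16


def extents_from_bsxa(bsxa):
    """Turn (bsPage, vdBlk) pairs into (pageOff, pageCnt, volBlock, blockCnt) rows."""
    rows = []
    for (page, blk), (next_page, _) in zip(bsxa, bsxa[1:]):
        if blk != -1:  # -1 entries describe holes, not storage
            count = next_page - page
            rows.append((page, count, blk, count * BLOCKS_PER_PAGE))
    return rows


def holes_from_bsxa(bsxa):
    """Return (first page, page count) for every hole in the map."""
    return [(page, next_page - page)
            for (page, blk), (next_page, _) in zip(bsxa, bsxa[1:])
            if blk == -1]


if __name__ == "__main__":
    holesome = [(0, -1), (31, 57600), (38, -1), (62, 57744), (63, 80512), (69, -1)]
    for row in extents_from_bsxa(holesome):
        print(row)
    print(holes_from_bsxa(holesome))
```

Applied to the holesome.dat values above, the rows come out as 31/7, 62/1, and 63/6, the same three extents that showfile -x reports, with holes covering pages 0 to 30 and 38 to 61.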
12. Create a clone fileset. Use ls -i to verify that the tag numbers of the files in both the original and clone filesets are the same. Use nvbmtpg to verify that even the mcell IDs are unchanged.
# clonefset bruden_dom dennis_fset den_clone
#
#
# showfdmn bruden_dom
Id Date Created LogPgs Version Domain Name
37f12c39.000263ea Tue Sep 28 16:59:37 1999 512 4 bruden_dom
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 131072 64000 51% on 256 256 /dev/disk/dsk0a
2 262144 215584 18% on 256 256 /dev/disk/dsk0b
3 1858624 1801584 3% on 256 256 /dev/disk/dsk2h
---------- ---------- ------
2251840 2081168 8%
#
#
# showfsets bruden_dom
bruce_fset
Id : 37f12c39.000263ea.1.8001
Files : 6, SLim= 0, HLim= 0
Blocks (512) : 68288, SLim= 50000, HLim= 60000 grc= none
Quota Status : user=off group=off
dennis_fset
Id : 37f12c39.000263ea.2.8001
Clone is : den_clone
Files : 10, SLim= 0, HLim= 0
Blocks (512) : 92618, SLim= 0, HLim= 0
Quota Status : user=off group=off
den_clone
Id : 37f12c39.000263ea.3.8003
Clone of : dennis_fset
Revision : 3
#
# mount bruden_dom#den_clone /usr/den_clone
#
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025780 294384 78% /usr
usr_domain#var 1426112 76026 294384 21% /var
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
bruden_dom#dennis_fset 2251840 92618 2081168 5% /usr/dennis
bruden_dom#den_clone 2251840 92618 2081168 5% /usr/den_clone
#
# ls -li /usr/dennis
total 45709
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
11 drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash
9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
#
# ls -li /usr/den_clone
total 45709
3 drwx------ 2 root system 8192 Sep 28 17:04 .tags
6 -rwxr-xr-x 1 root system 11646960 Sep 28 17:09 big1
7 -rwxr-xr-x 1 root system 11646960 Sep 28 17:11 big2
8 -rwxr-xr-x 1 root system 11646960 Sep 28 17:12 big3
11 drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash
9 -rw-r--r-- 1 root system 563200 Sep 29 09:17 holesome.dat
5 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
4 -rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 sm1
12 -rw-r--r-- 1 root system 11646960 Sep 28 17:41 stripe1
#
# nvbmtpg -r bruden_dom dennis_fset -t 10 -c
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 2 0 15
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 62228
st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:35:34 1999
fragId.type BF_FRAG_5K fragId.frag 129
Extent mcells from BSR_XTNTS record chain pointer.
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 15 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10
RECORD 0 bCnt 264 BSR_XTRA_XTNTS
xCnt 3
bsXA[ 0] bsPage 3 vdBlk 75488 (0x126e0)
bsXA[ 1] bsPage 4 vdBlk 98256 (0x17fd0)
bsXA[ 2] bsPage 7 vdBlk -1
#
#
#
# nvbmtpg -r bruden_dom den_clone -t 10 -c
This clone file has no metadata.
13. Modify a large file in the original fileset by appending some blocks to the end of the file (# cat some_file >> large_file), or use dd with the oseek and count options. Now use ls -i and nvbmtpg to see that, while the tags are unchanged, the mcell IDs are now changed. Use showfile -x and nvbmtpg -c to look at the extent maps of both original and clone.
# cat /etc/disktab >> /usr/dennis/sm1
#
# ls -li /usr/dennis/sm1
10 -rw-r--r-- 1 root system 93342 Sep 29 10:39 /usr/dennis/sm1
#
# ls -li /usr/den_clone/sm1
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 /usr/den_clone/sm1
#
# showfile -x /usr/dennis/sm1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
a.8001 2 16 11 simple ** ** async 25% sm1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 3 2 121328 48
3 1 2 75488 16
4 3 2 98256 48
7 4 2 75360 64
extentCnt: 4
#
# showfile -x /usr/den_clone/sm1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
a.8001 3 16 0 simple ** ** async 100% sm1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
extentCnt: 0
#
# nvbmtpg -r bruden_dom den_clone -t 10 -c
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 5 next mcell volume page cell 0 0 0 bfSetTag,tag 3,10
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 1
bsXA[ 0] bsPage 0 vdBlk -1
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 62228
st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 09:35:34 1999
fragId.type BF_FRAG_5K fragId.frag 129
#
# nvbmtpg -r bruden_dom dennis_fset -t 10 -c
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 2 0 15
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 93342
st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 10:39:23 1999
fragId.type BF_FRAG_4K fragId.frag 260
Extent mcells from BSR_XTNTS record chain pointer.
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 15 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10
RECORD 0 bCnt 264 BSR_XTRA_XTNTS
xCnt 4
bsXA[ 0] bsPage 3 vdBlk 75488 (0x126e0)
bsXA[ 1] bsPage 4 vdBlk 98256 (0x17fd0)
bsXA[ 2] bsPage 7 vdBlk 75360 (0x12660)
bsXA[ 3] bsPage 11 vdBlk -1
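The showfile -x rows and the nvbmtpg extent descriptors above describe the same extent map in two notations. The following Python sketch (not part of the AdvFS tools; the function name is hypothetical) shows one way to derive the showfile rows from the bsXA entries. It assumes that each descriptor runs up to the bsPage of the next descriptor and that vdBlk -1 marks a terminator or hole.

```python
# Sketch: turn nvbmtpg extent descriptors into showfile -x style rows.
BLOCKS_PER_PAGE = 16  # blksPerPage as reported in the BSR_XTNTS record

# (bsPage, vdBlk) pairs copied from the original file's output above.
primary = [(0, 121328), (3, -1)]                        # cell 10, BSR_XTNTS
extra = [(3, 75488), (4, 98256), (7, 75360), (11, -1)]  # cell 15, BSR_XTRA_XTNTS


def extent_rows(*descriptor_lists):
    """Return (pageOff, pageCnt, volBlock, blockCnt) for each mapped extent."""
    rows = []
    for descriptors in descriptor_lists:
        # An extent's length is the distance to the next descriptor's page.
        for (page, blk), (next_page, _) in zip(descriptors, descriptors[1:]):
            if blk == -1:
                continue  # assumed: no storage behind these pages
            count = next_page - page
            rows.append((page, count, blk, count * BLOCKS_PER_PAGE))
    return rows


rows = extent_rows(primary, extra)
for row in rows:
    print("%7d %7d %9d %8d" % row)
```

Under these assumptions the four rows agree with the showfile -x output for /usr/dennis/sm1, and the page counts add up to the 11 pages that showfile reports.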
14. Use nvbmtpg -c to look at mcell number 0 on page 0 of the BMT for the first volume of a domain. Note that the mcell free list is minimal. This is part of the dynamics built into AdvFS for Tru64 UNIX V5.
# nvbmtpg -r bruden_dom 3 0 0 -c
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 0 next mcell volume page cell 0 0 0 bfSetTag,tag 0,0
RECORD 0 bCnt 8 BSR_MCELL_FREE_LIST
headPg 0
RECORD 1 bCnt 12 BSR_DEF_DEL_MCELL_LIST
nextMCId page,cell 0,0 prevMCId page,cell 0,0
15. Look at bitfile attributes and bitfile inheritable attributes for reserved files, nonreserved regular files, and finally nonreserved directories.
#
# nvbmtpg -rv bruden_dom dennis_fset -t 1
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18
--------------------------------------------------------------------------
CELL 4 linkSegment 0 bfSetTag 2 (2.8001) tag 1 (1.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 2 0 17
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121392 (0x1da30)
bsXA[ 1] bsPage 48 vdBlk -1
#
# nvbmtpg -rv bruden_dom dennis_fset -t 10
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18
--------------------------------------------------------------------------
CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 940
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 2 0 15
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000
st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000
st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000
fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15
dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0
#
# nvbmtpg -rv bruden_dom dennis_fset -t 14
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 22 linkSegment 0 bfSetTag 2 (2.8001) tag 14 (e.8001)
next mcell volume page cell 1 0 23
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 269
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 92080 (0x167b0)
bsXA[ 1] bsPage 1 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)
dataSafety BFD_NIL
reqServices 1
optServices 0
extendSize 0
clientArea 0 0 0 0
rsvd1 0
rsvd2 0
rsvd_sec1 0
rsvd_sec2 0
rsvd_sec3 0
16. Look at bitfile attributes for a clone file with a modified original and then look at the bitfile attributes for the original.
# ls -li /usr/dennis/sm1
10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 /usr/dennis/sm1
#
# ls -li /usr/den_clone/sm1
10 -rw-r--r-- 1 root system 62228 Sep 29 09:35 /usr/den_clone/sm1
#
# nvbmtpg -rv bruden_dom den_clone -t 10
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 19 nextFreePg -1 nextfreeMCId page,cell 0,10
--------------------------------------------------------------------------
CELL 5 linkSegment 0 bfSetTag 3 (3.8003) tag 10 (a.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 234
cloneId 3 cloneCnt 0 maxClonePgs 7
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 1
bsXA[ 0] bsPage 0 vdBlk -1
bsXA[ 1] bsPage 1 vdBlk -1
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 62228
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 09:35:34 1999 st_umtime 250400000
st_atime Wed Sep 29 09:02:24 1999 st_uatime 451571000
st_ctime Wed Sep 29 09:35:34 1999 st_uctime 250400000
fragId.frag 129 fragId.type 5 BF_FRAG_5K fragPageOffset 7
dir_tag 2 (2.8001) st_flags 0 st_unused_1 458752 st_unused_2 0
#
# nvbmtpg -rv bruden_dom dennis_fset -t 10
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18
--------------------------------------------------------------------------
CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 940
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 2 0 15
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000
st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000
st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000
fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15
dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0
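The st_size, fragId.type, and fragPageOffset values in the outputs above fit a simple pattern: whole pages are held in extents and the leftover tail is held in a frag. The Python sketch below is only an illustration of that arithmetic, not the file system's allocation code. It assumes 512-byte blocks, a 16-block (8192-byte) page as shown by bfPgSz, and that the tail goes into the smallest 1K-multiple frag that holds it; the function name is hypothetical.

```python
PAGE_BYTES = 16 * 512  # bfPgSz 16 blocks, assumed 512 bytes per block


def pages_and_frag(st_size):
    """Return (full_pages, frag_kb) for a file of st_size bytes.

    Assumption: the tail that does not fill a whole page is stored in the
    smallest frag, sized in 1K steps, that can hold it (0 means no frag).
    Tails larger than 7K are not modeled here.
    """
    full_pages, tail = divmod(st_size, PAGE_BYTES)
    frag_kb = -(-tail // 1024)  # ceiling division
    return full_pages, frag_kb


# st_size values taken from the BMTR_FS_STAT records in this section.
for size in (62228, 93342, 124456):
    print(size, pages_and_frag(size))
```

For the clone (62228 bytes) this gives 7 pages and a 5K frag, and for the original (124456 bytes) 15 pages and a 2K frag, which matches the BF_FRAG_5K / fragPageOffset 7 and BF_FRAG_2K / fragPageOffset 15 values printed by nvbmtpg. The intermediate size 93342 gives 11 pages and a 4K frag, as seen in exercise 13.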
17. Print out the mcell records for an original and a cloned fileset.
# nvbmtpg -rv bruden_dom dennis_fset -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 1 0 7
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 0 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)
bsXA[ 1] bsPage 8 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)
blkHLimitHi,blkHLimitLo 0,30d40 (200000)
blkSLimitHi,blkSLimitLo 0,0 (0)
fileHLimitHi,fileHLimitLo 0,0 (0)
fileSLimitHi,fileSLimitLo 0,0 (0)
blkTLimit 0, fileTLimit 0, quotaStatus 1420
unused1 0, unused2 0, unused3 0, unused4 0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)
bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)
bfSetId.dirTag 2 (2.8001)
fragBfTag 1 (1.8001)
nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)
nxtDelPendingBfSet 0 (0.0)
state BFS_READY flags 0x0
cloneId 0 cloneCnt 3 numClones 1
fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0
uid 0 gid 1 mode 0744 setName "dennis_fset"
fsContext[0], fsContext[1] 2.8001 (rootTag)
fsContext[2], fsContext[3] 3.8001 (tagsTag)
fsContext[4], fsContext[5] 4.8001 (userQuotaTag)
fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)
fragGrps[0] firstFreeGrp 64 lastFreeGrp 32
fragGrps[1] firstFreeGrp -1 lastFreeGrp -1
fragGrps[2] firstFreeGrp 48 lastFreeGrp 48
fragGrps[3] firstFreeGrp -1 lastFreeGrp -1
fragGrps[4] firstFreeGrp 32 lastFreeGrp 32
fragGrps[5] firstFreeGrp 16 lastFreeGrp 16
fragGrps[6] firstFreeGrp -1 lastFreeGrp -1
fragGrps[7] firstFreeGrp 0 lastFreeGrp 0
RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)
flags MSS_NO_SHELVE (0x4)
smallFile 5
readAhead 0
readAheadIncr 5
readAheadMax 50
autoShelveThresh 100
userId 0
shelf 0
# nvbmtpg -rv bruden_dom den_clone -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 19 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)
next mcell volume page cell 1 0 14
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 154
cloneId 0 cloneCnt 0 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 1 0 20
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 57712 (0xe170)
bsXA[ 1] bsPage 2 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)
blkHLimitHi,blkHLimitLo 0,0 (0)
blkSLimitHi,blkSLimitLo 0,0 (0)
fileHLimitHi,fileHLimitLo 0,0 (0)
fileSLimitHi,fileSLimitLo 0,0 (0)
blkTLimit 0, fileTLimit 0, quotaStatus 0
unused1 0, unused2 0, unused3 0, unused4 0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 14 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)
bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)
bfSetId.dirTag 3 (3.8003)
fragBfTag 1 (1.8001)
nextCloneSetTag 0 (0.0) origSetTag 2 (2.8001)
nxtDelPendingBfSet 0 (0.0)
state BFS_READY flags 0x0
cloneId 3 cloneCnt 0 numClones 0
fsDev 0xdc8e9f23 freeFragGrps 0 oldQuotaStatus 0
uid 0 gid 1 mode 0744 setName "den_clone"
fsContext[0], fsContext[1] 2.8001 (rootTag)
fsContext[2], fsContext[3] 3.8001 (tagsTag)
fsContext[4], fsContext[5] 4.8001 (userQuotaTag)
fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)
fragGrps[0] firstFreeGrp 32 lastFreeGrp 32
fragGrps[1] firstFreeGrp -1 lastFreeGrp -1
fragGrps[2] firstFreeGrp -1 lastFreeGrp -1
fragGrps[3] firstFreeGrp -1 lastFreeGrp -1
fragGrps[4] firstFreeGrp -1 lastFreeGrp -1
fragGrps[5] firstFreeGrp 16 lastFreeGrp 16
fragGrps[6] firstFreeGrp -1 lastFreeGrp -1
fragGrps[7] firstFreeGrp 0 lastFreeGrp 0
RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)
flags MSS_NO_SHELVE (0x4)
smallFile 5
readAhead 0
readAheadIncr 5
readAheadMax 50
autoShelveThresh 100
userId 0
shelf 0
Extent mcells from BSR_XTNTS record chain pointer.
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 20 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 3 (3.8003)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 264 version 0 BSR_XTRA_XTNTS (5)
blksPerPage 16
xCnt 2
bsXA[ 0] bsPage 2 vdBlk 80608 (0x13ae0)
bsXA[ 1] bsPage 8 vdBlk -1
bsXA[ 2] bsPage 0 vdBlk 0 (0x0)
bsXA[ 3] bsPage 0 vdBlk 0 (0x0)
bsXA[ 4] bsPage 0 vdBlk 0 (0x0)
bsXA[ 5] bsPage 0 vdBlk 0 (0x0)
bsXA[ 6] bsPage 0 vdBlk 0 (0x0)
bsXA[ 7] bsPage 0 vdBlk 0 (0x0)
bsXA[ 8] bsPage 0 vdBlk 0 (0x0)
bsXA[ 9] bsPage 0 vdBlk 0 (0x0)
bsXA[10] bsPage 0 vdBlk 65616 (0x10050)
bsXA[11] bsPage 2 vdBlk 1 (0x1)
bsXA[12] bsPage 19 vdBlk 0 (0x0)
bsXA[13] bsPage 0 vdBlk 16 (0x10)
bsXA[14] bsPage 8 vdBlk 0 (0x0)
bsXA[15] bsPage 0 vdBlk 0 (0x0)
bsXA[16] bsPage 0 vdBlk -1
bsXA[17] bsPage 0 vdBlk 0 (0x0)
bsXA[18] bsPage 0 vdBlk 0 (0x0)
bsXA[19] bsPage 0 vdBlk 0 (0x0)
bsXA[20] bsPage 0 vdBlk 4 (0x4)
bsXA[21] bsPage 0 vdBlk 0 (0x0)
bsXA[22] bsPage 0 vdBlk 0 (0x0)
bsXA[23] bsPage 0 vdBlk 0 (0x0)
bsXA[24] bsPage 0 vdBlk 0 (0x0)
bsXA[25] bsPage 0 vdBlk 0 (0x0)
bsXA[26] bsPage 0 vdBlk 0 (0x0)
bsXA[27] bsPage 0 vdBlk 0 (0x0)
bsXA[28] bsPage 0 vdBlk 0 (0x0)
bsXA[29] bsPage 0 vdBlk 0 (0x0)
bsXA[30] bsPage 0 vdBlk 0 (0x0)
bsXA[31] bsPage 0 vdBlk 0 (0x0)
18. This exercise is difficult. Try deleting a fileset with lots of large files. While the deletion is in progress, examine the fileset mcell record to follow the progress of the deletion through the changes to its delete-pending fields.
Use nvbmtpg to try to catch the action.
19. Convince yourself that executing the chfsets command really does result in modification of the appropriate fileset attributes record.
# nvbmtpg -rv bruden_dom dennis_fset -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 1 0 7
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 0 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)
bsXA[ 1] bsPage 8 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)
blkHLimitHi,blkHLimitLo 0,30d40 (200000)
blkSLimitHi,blkSLimitLo 0,0 (0)
fileHLimitHi,fileHLimitLo 0,0 (0)
fileSLimitHi,fileSLimitLo 0,0 (0)
blkTLimit 0, fileTLimit 0, quotaStatus 1420
unused1 0, unused2 0, unused3 0, unused4 0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)
bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)
bfSetId.dirTag 2 (2.8001)
fragBfTag 1 (1.8001)
nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)
nxtDelPendingBfSet 0 (0.0)
state BFS_READY flags 0x0
cloneId 0 cloneCnt 3 numClones 1
fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0
uid 0 gid 1 mode 0744 setName "dennis_fset"
fsContext[0], fsContext[1] 2.8001 (rootTag)
fsContext[2], fsContext[3] 3.8001 (tagsTag)
fsContext[4], fsContext[5] 4.8001 (userQuotaTag)
fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)
fragGrps[0] firstFreeGrp 64 lastFreeGrp 32
fragGrps[1] firstFreeGrp -1 lastFreeGrp -1
fragGrps[2] firstFreeGrp 48 lastFreeGrp 48
fragGrps[3] firstFreeGrp -1 lastFreeGrp -1
fragGrps[4] firstFreeGrp 32 lastFreeGrp 32
fragGrps[5] firstFreeGrp 16 lastFreeGrp 16
fragGrps[6] firstFreeGrp -1 lastFreeGrp -1
fragGrps[7] firstFreeGrp 0 lastFreeGrp 0
RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)
flags MSS_NO_SHELVE (0x4)
smallFile 5
readAhead 0
readAheadIncr 5
readAheadMax 50
autoShelveThresh 100
userId 0
shelf 0
#
#
# chfsets -b 200000 bruden_dom dennis_fset
chfsets: At least one fileset in this domain must be mounted.
#
# mount bruden_dom#dennis_fset /usr/dennis
#
#
# mount bruden_dom#bruce_fset /usr/bruce
#
# mount bruden_dom#den_clone /usr/den_clone
#
#
# df -t advfs
Filesystem 512-blocks Used Available Capacity Mounted on
usr_domain#usr 1426112 1025888 235376 82% /usr
usr_domain#var 1426112 134972 235376 37% /var
bruden_dom#dennis_fset 200000 92940 107060 47% /usr/dennis
bruden_dom#bruce_fset 50000 50000 0 100% /usr/bruce
bruden_dom#den_clone 2251840 92618 2079520 5% /usr/den_clone
#
#
# chfsets -b 200000 bruden_dom dennis_fset
dennis_fset
Id : 37f12c39.000263ea.2.8001
Block H Limit: 100000 --> 200000
#
# nvbmtpg -rv bruden_dom dennis_fset -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 4 nextFreePg -1 nextfreeMCId page,cell 0,24
--------------------------------------------------------------------------
CELL 6 linkSegment 0 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 1 0 7
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 0 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 34688 (0x8780)
bsXA[ 1] bsPage 8 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BFS_QUOTA_ATTR (18)
blkHLimitHi,blkHLimitLo 0,30d40 (200000)
blkSLimitHi,blkSLimitLo 0,0 (0)
fileHLimitHi,fileHLimitLo 0,0 (0)
fileSLimitHi,fileSLimitLo 0,0 (0)
blkTLimit 0, fileTLimit 0, quotaStatus 1420
unused1 0, unused2 0, unused3 0, unused4 0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 7 linkSegment 1 bfSetTag -2 (fffffffe.0) tag 2 (2.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 220 version 0 BSR_BFS_ATTR (8)
bfSetId.domainId 37f12c39.263ea (Tue Sep 28 16:59:37 1999)
bfSetId.dirTag 2 (2.8001)
fragBfTag 1 (1.8001)
nextCloneSetTag 3 (3.8003) origSetTag 0 (0.0)
nxtDelPendingBfSet 0 (0.0)
state BFS_READY flags 0x0
cloneId 0 cloneCnt 3 numClones 1
fsDev 0xaf0db242 freeFragGrps 2 oldQuotaStatus 0
uid 0 gid 1 mode 0744 setName "dennis_fset"
fsContext[0], fsContext[1] 2.8001 (rootTag)
fsContext[2], fsContext[3] 3.8001 (tagsTag)
fsContext[4], fsContext[5] 4.8001 (userQuotaTag)
fsContext[6], fsContext[7] 5.8001 (groupQuotaTag)
fragGrps[0] firstFreeGrp 64 lastFreeGrp 32
fragGrps[1] firstFreeGrp -1 lastFreeGrp -1
fragGrps[2] firstFreeGrp 48 lastFreeGrp 48
fragGrps[3] firstFreeGrp -1 lastFreeGrp -1
fragGrps[4] firstFreeGrp 32 lastFreeGrp 32
fragGrps[5] firstFreeGrp 16 lastFreeGrp 16
fragGrps[6] firstFreeGrp -1 lastFreeGrp -1
fragGrps[7] firstFreeGrp 0 lastFreeGrp 0
RECORD 1 bCnt 36 version 0 BSR_SET_SHELVE_ATTR (17)
flags MSS_NO_SHELVE (0x4)
smallFile 5
readAhead 0
readAheadIncr 5
readAheadMax 50
autoShelveThresh 100
userId 0
shelf 0
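The BSR_BFS_QUOTA_ATTR record prints each limit as a Hi,Lo pair in hexadecimal followed by the decimal value in parentheses. A minimal Python sketch of that conversion follows. It assumes each half is a 32-bit word with Hi as the most significant half; the helper name is hypothetical and the on-disk field widths are not stated in this guide.

```python
def combine_limit(hi_hex, lo_hex):
    """Join the Hi,Lo halves printed by nvbmtpg (hex strings) into one integer.

    Assumption: each half is a 32-bit word, Hi being the most significant.
    """
    return (int(hi_hex, 16) << 32) | int(lo_hex, 16)


# blkHLimitHi,blkHLimitLo 0,30d40 from the record above
blk_hard_limit = combine_limit("0", "30d40")
print(blk_hard_limit)        # 200000, the value given to chfsets -b

# The previous limit reported by chfsets (100000) in the same notation.
print("%x" % 100000)         # 186a0
```

When checking the effect of chfsets, compare the blkHLimitLo value captured before the command with the one captured after it; a change from 186a0 to 30d40 would correspond to the "100000 --> 200000" line that chfsets prints.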
20. Examine the domain attribute and virtual disk records for your AdvFS disks.
# nvbmtpg -r bruden_dom -f
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
There is 1 page in the BMT on this volume.
The BMT uses 1 extents (out of 1) in 1 mcell.
There are 1 pages on the free list with a total of 4 free mcells.
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
There is 1 page in the BMT on this volume.
The BMT uses 1 extents (out of 1) in 1 mcell.
There are 1 pages on the free list with a total of 10 free mcells.
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 48 BMT page 0
--------------------------------------------------------------------------
There is 1 page in the BMT on this volume.
The BMT uses 1 extents (out of 1) in 1 mcell.
There are 1 pages on the free list with a total of 19 free mcells.
#
#
# nvbmtpg -rRv bruden_dom 1 0 6
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7
--------------------------------------------------------------------------
CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -6 (fffffffa.0)(RBMT)
next mcell volume page cell 1 0 27
RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)
vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)
state 1
vdIndex 1
jays_new_field 0
vdBlkCnt 131072
stgCluster 16
maxPgSz 16
bmtXtntPgs 128
serviceClass 1
RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)
bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)
maxVds 256
bfSetDirTag -2 (fffffffe.0)
RECORD 2 bCnt 52 version 0 BSR_DMN_MATTR (15)
seqNum 1
delPendingBfSet 0 (0.0)
uid 0
gid 1
mode 0744
vdCnt 3
recoveryFailed 0
bfSetDirTag -8
ftxLogTag -9
ftxLogPgs 512
RECORD 3 bCnt 20 version 0 BSR_DMN_TRANS_ATTR (21)
chainVdIndex -1
chainMCId page,cell 0,0
op 0
dev 0x0
#
# nvbmtpg -rRv bruden_dom 2 0 6
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 32 RBMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7
--------------------------------------------------------------------------
CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -12 (fffffff4.0)(RBMT)
next mcell volume page cell 2 0 27
RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)
vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)
state 1
vdIndex 2
jays_new_field 0
vdBlkCnt 262144
stgCluster 16
maxPgSz 16
bmtXtntPgs 128
serviceClass 1
RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)
bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)
maxVds 256
bfSetDirTag -2 (fffffffe.0)
#
# nvbmtpg -rRv bruden_dom 3 0 6
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 32 RBMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 20 nextFreePg 0 nextfreeMCId page,cell 0,7
--------------------------------------------------------------------------
CELL 6 linkSegment 1 bfSetTag -2 (fffffffe.0) tag -18 (ffffffee.0)(RBMT)
next mcell volume page cell 3 0 27
RECORD 0 bCnt 40 version 0 BSR_VD_ATTR (3)
vdMntId 37f2795f.000c98fb (Wed Sep 29 16:41:03 1999)
state 1
vdIndex 3
jays_new_field 0
vdBlkCnt 1858632
stgCluster 16
maxPgSz 16
bmtXtntPgs 128
serviceClass 1
RECORD 1 bCnt 24 version 0 BSR_DMN_ATTR (4)
bfDomainId 37f12c39.000263ea (Tue Sep 28 16:59:37 1999)
maxVds 256
bfSetDirTag -2 (fffffffe.0)
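The bfDomainId and vdMntId values above are printed as two hexadecimal words followed by a date. The Python sketch below decodes them on the assumption that the first word is seconds since the UNIX epoch and the second word is microseconds; the guide does not state this layout, and the function name is hypothetical.

```python
from datetime import datetime, timezone


def decode_id(id_text):
    """Split 'secs.usecs' (both hex) into a UTC datetime and a microsecond count.

    Assumption: first word = seconds since 1970-01-01 UTC, second = microseconds.
    """
    secs_hex, usecs_hex = id_text.split(".")
    when = datetime.fromtimestamp(int(secs_hex, 16), tz=timezone.utc)
    return when, int(usecs_hex, 16)


created, created_usecs = decode_id("37f12c39.000263ea")   # bfDomainId
mounted, mounted_usecs = decode_id("37f2795f.000c98fb")   # vdMntId

print(created.isoformat(), created_usecs)
print(mounted.isoformat(), mounted_usecs)
```

The decoded times are 20:59:37 UTC on 28 September 1999 and 20:41:03 UTC on 29 September 1999. They agree with the dates nvbmtpg prints (16:59:37 and 16:41:03) if the system's local time zone was four hours behind UTC, which is an inference rather than something the transcript states.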
21. Examine the BMT POSIX file stat records for a few of your files. See how the information stored in this record is reflected in the output of ls -l.
# nvbmtpg -rv bruden_dom dennis_fset -t 10
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 10 nextFreePg -1 nextfreeMCId page,cell 0,18
--------------------------------------------------------------------------
CELL 10 linkSegment 0 bfSetTag 2 (2.8001) tag 10 (a.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 940
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 2 0 15
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 10 st_mode 100644 (S_IFREG) st_nlink 1 st_size 124456
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 11:13:22 1999 st_umtime 538485000
st_atime Wed Sep 29 11:09:47 1999 st_uatime 126376000
st_ctime Wed Sep 29 11:13:22 1999 st_uctime 538485000
fragId.frag 386 fragId.type 2 BF_FRAG_2K fragPageOffset 15
dir_tag 2 (2.8001) st_flags 0 st_unused_1 983040 st_unused_2 0
#
# pwd
/usr/bruden/advfs
#
# cd /usr/dennis
#
# ls -li sm1
10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1
#
#
#
# (0644 is rw-r--r--)
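The st_mode field in BMTR_FS_STAT is printed in octal and carries both the file type and the permission bits that ls -l shows. As a small illustration, the Python sketch below uses the standard library's stat.filemode to render the st_mode values that appear in this section's nvbmtpg output; it runs on any system and does not read AdvFS structures.

```python
import stat

# st_mode values as printed by nvbmtpg (octal text) in this section.
mode_texts = ("100644", "120777", "40777")

decoded = {}
for text in mode_texts:
    mode = int(text, 8)                   # octal text to integer
    decoded[text] = stat.filemode(mode)   # same layout as the ls -l column
    print(text, decoded[text])
```

The regular file sm1 (100644) comes out as -rw-r--r--, matching the ls -li line above. The other two values belong to a symbolic link and a directory.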
22. Now create some symbolic links and examine the corresponding BMT fast symbolic link records.
# pwd
/usr/dennis
#
# ln -s /etc/passwd pw
#
# ls -li pw
16 lrwxrwxrwx 1 root system 11 Sep 29 17:24 pw -> /etc/passwd
#
# nvbmtpg -rv bruden_dom dennis_fset -t 16 -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26
--------------------------------------------------------------------------
CELL 24 linkSegment 0 bfSetTag 2 (2.8001) tag 16 (10.8001)
next mcell volume page cell 1 0 25
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 113
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 1
bsXA[ 0] bsPage 0 vdBlk -1
bsXA[ 1] bsPage 0 vdBlk 0 (0x0)
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 16 st_mode 120777 (S_IFLNK) st_nlink 1 st_size 11
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 17:24:25 1999 st_umtime 167392000
st_atime Wed Sep 29 17:24:31 1999 st_uatime 363681000
st_ctime Wed Sep 29 17:24:25 1999 st_uctime 167392000
fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0
dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 25 linkSegment 1 bfSetTag 2 (2.8001) tag 16 (10.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 15 version 0 BMTR_FS_DATA (254)
/etc/passwd
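Two cross-checks tie the ls -li line to the mcell records above: the inode number is the tag, which nvbmtpg also prints in hexadecimal as tag.sequence, and st_size is the length of the link target stored in the BMTR_FS_DATA record. The Python sketch below works through both. The helper name is hypothetical, and reading the 4-byte difference between bCnt and the target length as record header overhead is an assumption, not something the guide states.

```python
def tag_text(num, seq):
    """Format a tag the way nvbmtpg shows it in parentheses: hex num.hex seq."""
    return "%x.%x" % (num & 0xFFFFFFFF, seq)


link_target = "/etc/passwd"   # from the ln -s command above
st_size = 11                  # st_size in the BMTR_FS_STAT record
fs_data_bcnt = 15             # bCnt of the BMTR_FS_DATA record
inode = 16                    # first column of ls -li pw

print(tag_text(inode, 0x8001))          # 10.8001
print(len(link_target) == st_size)      # True

# Assumed to be per-record header bytes; the source does not confirm this.
overhead = fs_data_bcnt - len(link_target)
print(overhead)                         # 4
```

The same formatting reproduces the negative tags seen elsewhere in this section, for example -2 printed as fffffffe.0.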
23. Use mktrashcan to create a trashcan directory and then examine the BMT record that points to the trash.
# shtrashcan /usr/dennis
'/usr/dennis/den_trash' attached to '/usr/dennis'
#
#
# ls -lid /usr/dennis
2 drwxrwxrwx 5 root system 8192 Sep 29 17:24 /usr/dennis
#
# nvbmtpg -rv bruden_dom dennis_fset -t 2 -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26
--------------------------------------------------------------------------
CELL 8 linkSegment 0 bfSetTag 2 (2.8001) tag 2 (2.8001)
next mcell volume page cell 1 0 9
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 34816 (0x8800)
bsXA[ 1] bsPage 1 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)
dataSafety BFD_NIL
reqServices 1
optServices 0
extendSize 0
clientArea 0 0 0 0
rsvd1 0
rsvd2 0
rsvd_sec1 0
rsvd_sec2 0
rsvd_sec3 0
RECORD 3 bCnt 8 version 0 BMTR_FS_TIME (251)
last sync Wed Sep 29 16:41:04 1999
RECORD 4 bCnt 12 version 0 BMTR_FS_UNDEL_DIR (252)
dir_tag 11 (b.8001)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 9 linkSegment 1 bfSetTag 2 (2.8001) tag 2 (2.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 2 st_mode 40777 (S_IFDIR) st_nlink 5 st_size 8192
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 17:24:25 1999 st_umtime 167392000
st_atime Wed Sep 29 16:00:36 1999 st_uatime 968173000
st_ctime Wed Sep 29 17:24:25 1999 st_uctime 167392000
fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0
dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0
24. Find the mcell ID of a file system root and examine its BMT file system time records.
# ls -lid /usr/bruce
2 drwxrwxrwx 3 root system 8192 Sep 28 17:24 /usr/bruce
#
# nvbmtpg -rv bruden_dom bruce_fset -t 2 -c
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 48 BMT page 0
--------------------------------------------------------------------------
pageId 0 megaVersion 4
freeMcellCnt 2 nextFreePg -1 nextfreeMCId page,cell 0,26
--------------------------------------------------------------------------
CELL 3 linkSegment 0 bfSetTag 1 (1.8001) tag 2 (2.8001)
next mcell volume page cell 1 0 4
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 2
cloneId 0 cloneCnt 0 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_FTX_AGENT (2)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize 0
delLink next page,cell 0,0 prev page,cell 0,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 34656 (0x8760)
bsXA[ 1] bsPage 1 vdBlk -1
RECORD 2 bCnt 64 version 0 BSR_BF_INHERIT_ATTR (16)
dataSafety BFD_NIL
reqServices 1
optServices 0
extendSize 0
clientArea 0 0 0 0
rsvd1 0
rsvd2 0
rsvd_sec1 0
rsvd_sec2 0
rsvd_sec3 0
RECORD 3 bCnt 8 version 0 BMTR_FS_TIME (251)
last sync Wed Sep 29 16:41:26 1999
--------------------------------------------------------------------------
--------------------------------------------------------------------------
CELL 4 linkSegment 1 bfSetTag 1 (1.8001) tag 2 (2.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 2 st_mode 40777 (S_IFDIR) st_nlink 3 st_size 8192
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Tue Sep 28 17:24:53 1999 st_umtime 191806000
st_atime Tue Sep 28 17:04:01 1999 st_uatime 0
st_ctime Tue Sep 28 17:24:53 1999 st_uctime 191806000
fragId.frag 0 fragId.type 0 BF_FRAG_ANY fragPageOffset 0
dir_tag 2 (2.8001) st_flags 0 st_unused_1 0 st_unused_2 0
25. Start by running showfsets on your AdvFS file domains so that you will know a few fileset IDs to use in the remaining exercises.
# showfsets bruden_dom
bruce_fset
Id : 37f12c39.000263ea.1.8001
Files : 6, SLim= 0, HLim= 0
Blocks (512) : 68288, SLim= 50000, HLim= 200000 grc= none
Quota Status : user=off group=off
dennis_fset
Id : 37f12c39.000263ea.2.8001
Clone is : den_clone
Files : 13, SLim= 0, HLim= 0
Blocks (512) : 92940, SLim= 0, HLim= 400000
Quota Status : user=off group=off
den_clone
Id : 37f12c39.000263ea.3.8003
Clone of : dennis_fset
Revision : 3
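The fileset IDs printed by showfsets are four dot-separated hexadecimal fields. The sketch below splits one into integers. It assumes that the first two fields identify the domain and behave like a seconds/microseconds timestamp, and that the last two fields are the fileset's tag and sequence number in the root tag file. The function and key names are illustrative and are not AdvFS identifiers.

```python
from datetime import datetime, timezone

def parse_fileset_id(text):
    """Split a showfsets Id such as '37f12c39.000263ea.1.8001' into integers."""
    dom_sec, dom_usec, tag, seq = (int(part, 16) for part in text.split("."))
    return {"dom_sec": dom_sec, "dom_usec": dom_usec, "tag": tag, "seq": seq}

fsid = parse_fileset_id("37f12c39.000263ea.2.8001")
# Assumption: the first field is a UNIX time in seconds (domain creation time).
created = datetime.fromtimestamp(fsid["dom_sec"], tz=timezone.utc)
print(fsid["tag"], hex(fsid["seq"]), created.date())
```

Under that assumption the first field of this domain's ID decodes to a date in late September 1999, which is consistent with the file timestamps shown throughout these solutions.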
26. Use the nvtagpg program, located in /sbin/advfs, to list the root tag files of an AdvFS file domain.
# nvtagpg -r bruden_dom
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 96 root TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 3 numDeadTMaps 0 nextFreePage 0 nextFreeMap 5
tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 1 0 1 bruce_fset
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 6 dennis_fset
tMapA[3] tag 3 seqNo 3 primary mcell (vol,page,cell) 1 0 19 den_clone
27. Select a target file and use both showfile and ls -i to obtain its tag number. Two programs are needed because ls -i prints only the tag number, in decimal, while showfile prints the tag in hexadecimal followed by the sequence number.
Divide the tag number by 1022 and write down both the quotient and remainder. The quotient determines the page number containing the appropriate tagmap entry while the remainder determines the position within the page.
If the sequence numbers of nvtagpg and showfile don’t match, you don’t have the right tagmap entry. You may need to convert from hexadecimal to decimal to verify the match.
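The arithmetic in this exercise can be sketched as follows. The divisor 1022 comes from the exercise text. Masking the sequence number with 0x7FFF is an assumption: in this transcript showfile prints 8001 where nvtagpg prints seqNo 1, and 8003 where it prints seqNo 3, which suggests that the 0x8000 bit is a flag rather than part of the count. The function names are illustrative.

```python
TMAPS_PER_PAGE = 1022  # divisor given in the exercise text

def parse_showfile_id(text):
    """Turn a showfile Id such as 'a.8001' into (tag, sequence) integers."""
    tag_hex, seq_hex = text.split(".")
    return int(tag_hex, 16), int(seq_hex, 16)

def tagmap_location(tag):
    """Return (tag file page, entry index within that page) for a tag."""
    return divmod(tag, TMAPS_PER_PAGE)

tag, seq = parse_showfile_id("a.8001")
page, slot = tagmap_location(tag)
# Assumption: 0x8000 is a flag bit, so the low bits carry the seqNo nvtagpg prints.
print(tag, page, slot, seq & 0x7FFF)
```

For sm1 this gives tag 10 on page 0 at entry 10, which is the tMapA[10] entry that nvtagpg reports.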
# showfile -x sm1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
a.8001 2 16 15 simple ** ** async 20% sm1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 3 2 121328 48
3 1 2 75488 16
4 3 2 98256 48
7 4 2 75360 64
11 4 2 98480 64
extentCnt: 5
#
# ls -li sm1
10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1
#
# nvtagpg -r bruden_dom -T 2 -t 10
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 34688 "dennis_fset" TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 16 numDeadTMaps 0 nextFreePage 0 nextFreeMap 18
tMapA[10] tag 10 seqNo 1 primary mcell (vol,page,cell) 2 0 10
#
#
# nvbmtpg -r bruden_dom 2 0 10
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 48 BMT page 0
--------------------------------------------------------------------------
CELL 10 next mcell volume page cell 0 0 0 bfSetTag,tag 2,10
RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID
RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 2 0 15
firstXtnt mcellCnt 2 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 121328 (0x1d9f0)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 BMTR_FS_STAT
st_mode 100644 (S_IFREG) st_uid 0 st_gid 0 st_size 124456
st_nlink 1 dir_tag 2 st_mtime Wed Sep 29 11:13:22 1999
fragId.type BF_FRAG_2K fragId.frag 386
#
# ls -li sm1
10 -rw-r--r-- 1 root system 124456 Sep 29 11:13 sm1
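The bsXA descriptors in the BSR_XTNTS record can be related to the rows that showfile -x prints. The sketch below assumes that each descriptor gives the starting page and starting disk block of an extent, that the next descriptor's bsPage marks where the extent ends, and that a negative vdBlk means no storage is mapped from that page on. The function name is illustrative.

```python
def extents_from_bsxa(bsxa, blks_per_page=16):
    """Convert (bsPage, vdBlk) descriptors into (pageOff, pageCnt, volBlock, blockCnt) rows."""
    rows = []
    for (page, blk), (next_page, _next_blk) in zip(bsxa, bsxa[1:]):
        if blk < 0:
            continue  # assumption: a negative vdBlk means no storage is mapped here
        page_cnt = next_page - page
        rows.append((page, page_cnt, blk, page_cnt * blks_per_page))
    return rows

# Descriptors copied from RECORD 1 of cell 10 above.
print(extents_from_bsxa([(0, 121328), (3, -1)]))
```

The single row produced matches the first line of the showfile -x extent map for sm1 (pages 0 to 2 at block 121328, 48 blocks). The record reports mcellCnt 2 and a chain mcell at 2 0 15, so the remaining extents would be found by following that chain.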
28. Use the showfile -x command on the .tags M-file for a fileset to determine the extent map of a fileset’s tag file.
# showfile -x /usr/dennis/.tags/M1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
1.8001 1 16 8 simple ** ** ftx 100% M1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 8 1 34528 128
extentCnt: 1
#
#
# showfile -x /usr/dennis/.tags/M2
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
2.8001 1 16 8 simple ** ** ftx 100% M2
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 8 1 34688 128
extentCnt: 1
#
# showfile -x /usr/dennis/.tags/M3
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
3.8003 1 16 8 simple ** ** ftx 50% M3
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 2 1 57712 32
2 6 1 80608 96
extentCnt: 2
#
# showfile -x /usr/dennis/.tags/M4
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
showfile: lstat failed for file ’/usr/dennis/.tags/M4’; No such file or directory
29. Create a directory and connect into it. Now look at the directory file by typing the following commands:
# vfilepg -r domain_name fileset_name directory/spec -f d
# od -A x -a -h -H . | more
Notice the entries for . and .. along with all the empty directory entries.
# ls -l | grep ^d
drwx------ 2 root system 8192 Sep 28 17:04 .tags
drwxrwxrwx 2 root system 8192 Sep 29 09:30 den_trash
drwxr-xr-x 2 root system 8192 Sep 29 11:09 testdir
#
# cd testdir
# vfilepg -r bruden_dom dennis_fset testdir -f d
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0
--------------------------------------------------------------------------
tag name
14 .
2 ..
15 smsub
#
#
# od -A x -a -h -H .
0000000 so nul nul nul dc4 nul soh nul . nul nul nul so nul nul nul
000e 0000 0014 0001 002e 0000 000e 0000
0000000e 00010014 0000002e 0000000e
0000010 soh nul nul nul stx nul nul nul dc4 nul stx nul . . nul nul
8001 0000 0002 0000 0014 0002 2e2e 0000
00008001 00000002 00020014 00002e2e
(…)
#
#
# pwd
/usr/dennis/testdir
30. Use the touch command to create five files, i, ii, iii, iv, and v within your new directory. Use od and vfilepg to examine the directory file.
# touch i
# touch ii
# touch iii
# touch iv
# touch v
#
#
# vfilepg -r bruden_dom dennis_fset testdir -f d
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0
--------------------------------------------------------------------------
tag name
14 .
2 ..
15 smsub
17 i
18 ii
19 iii
20 iv
21 v
#
# od -A x -a -h -H .
0000000 so nul nul nul dc4 nul soh nul . nul nul nul so nul nul nul
000e 0000 0014 0001 002e 0000 000e 0000
0000000e 00010014 0000002e 0000000e
0000010 soh nul nul nul stx nul nul nul dc4 nul stx nul . . nul nul
8001 0000 0002 0000 0014 0002 2e2e 0000
00008001 00000002 00020014 00002e2e
0000020 stx nul nul nul soh nul nul nul si nul nul nul can nul enq nul
0002 0000 8001 0000 000f 0000 0018 0005
00000002 00008001 0000000f 00050018
0000030 s m s u b nul nul nul si nul nul nul soh nul nul nul
6d73 7573 0062 0000 000f 0000 8001 0000
75736d73 00000062 0000000f 00008001
0000040 dc1 nul nul nul dc4 nul soh nul i nul nul nul dc1 nul nul nul
0011 0000 0014 0001 0069 0000 0011 0000
00000011 00010014 00000069 00000011
0000050 soh nul nul nul dc2 nul nul nul dc4 nul stx nul i i nul nul
8001 0000 0012 0000 0014 0002 6969 0000
00008001 00000012 00020014 00006969
0000060 dc2 nul nul nul soh nul nul nul dc3 nul nul nul dc4 nul etx nul
0012 0000 8001 0000 0013 0000 0014 0003
00000012 00008001 00000013 00030014
0000070 i i i nul dc3 nul nul nul soh nul nul nul dc4 nul nul nul
6969 0069 0013 0000 8001 0000 0014 0000
00696969 00000013 00008001 00000014
0000080 dc4 nul stx nul i v nul nul dc4 nul nul nul soh nul nul nul
0014 0002 7669 0000 0014 0000 8001 0000
00020014 00007669 00000014 00008001
0000090 nak nul nul nul dc4 nul soh nul v nul nul nul nak nul nul nul
0015 0000 0014 0001 0076 0000 0015 0000
00000015 00010014 00000076 00000015
(…)
#
#
# vfilepg -r bruden_dom dennis_fset -t 14
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0
--------------------------------------------------------------------------
000000 0e 00 00 00 14 00 01 00 2e 00 00 00 0e 00 00 00 ................
000010 01 80 00 00 02 00 00 00 14 00 02 00 2e 2e 00 00 ................
000020 02 00 00 00 01 80 00 00 0f 00 00 00 18 00 05 00 ................
000030 73 6d 73 75 62 00 00 00 0f 00 00 00 01 80 00 00 smsub...........
000040 11 00 00 00 14 00 01 00 69 00 00 00 11 00 00 00 ........i.......
000050 01 80 00 00 12 00 00 00 14 00 02 00 69 69 00 00 ............ii..
000060 12 00 00 00 01 80 00 00 13 00 00 00 14 00 03 00 ................
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
000080 14 00 02 00 69 76 00 00 14 00 00 00 01 80 00 00 ....iv..........
000090 15 00 00 00 14 00 01 00 76 00 00 00 15 00 00 00 ........v.......
(…)
#
#
#
# vfilepg -r bruden_dom dennis_fset -t 14 | grep iii
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
#
#
# vfilepg -r bruden_dom dennis_fset -t 14 | grep 070
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
000700 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
001070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
#
# vfilepg -r bruden_dom dennis_fset -t 14 | grep 000070
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
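The hex dumps above can be decoded entry by entry. The layout used in this sketch is inferred from the dump itself and is not taken from the AdvFS header files: a 4-byte tag, a 2-byte entry size, a 2-byte name length, the name padded with NUL bytes to a 4-byte boundary, and a trailing 8 bytes that repeat the tag and carry the sequence number, all little-endian. The function and key names are illustrative.

```python
import struct

def parse_dir_entries(data):
    """Walk directory entries in a raw page; the layout is inferred from the dump above."""
    entries = []
    off = 0
    while off + 8 <= len(data):
        tag, size, name_len = struct.unpack_from("<IHH", data, off)
        if size < 16 or off + size > len(data):
            break  # truncated or unused space; stop rather than guess
        name = data[off + 8 : off + 8 + name_len].decode("ascii")
        # The last 8 bytes of each entry repeat the tag and hold the sequence number.
        _tail_tag, seq = struct.unpack_from("<II", data, off + size - 8)
        entries.append({"offset": off, "tag": tag, "name": name, "size": size, "seq": seq})
        off += size
    return entries

# First 0x40 bytes of the page, copied from the vfilepg -t 14 output.
page = bytes.fromhex(
    "0e000000140001002e0000000e000000"
    "0180000002000000140002002e2e0000"
    "02000000018000000f00000018000500"
    "736d7375620000000f00000001800000"
)
for entry in parse_dir_entries(page):
    print(entry["offset"], entry["tag"], entry["name"], hex(entry["seq"]))
```

The three entries recovered (tags 14, 2, and 15) are the same ones that vfilepg -f d lists for ".", "..", and smsub.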
31. Remove the file iii. Use od to determine what happens to the directory entry for iii.
# rm iii
#
# vfilepg -r bruden_dom dennis_fset -t 14 | grep 000070
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
#
32. Now remove ii. Notice how the old directory entries for ii and iii have been merged.
See solution for #31.
33. You may have noticed that the tags for ii and iii continue to reside in the directory file and may be wondering about the possibilities for file undeletion. Remember what happens to free tagmap entries. They are placed on a free list for recycling.
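Why recycling rules out simple undeletion can be seen in a toy model. The class below is not the AdvFS implementation; it only assumes that a freed tag goes onto a free list, that the most recently freed tag is reused first, and that a reused tag receives a higher sequence number. Those assumptions match what the next exercise shows, where the new file vii takes over tag 18 with sequence 8002.

```python
class ToyTagMap:
    """Toy model of tag recycling; not the AdvFS implementation."""

    def __init__(self):
        self.seq = {}        # tag -> current sequence number
        self.free = []       # freed tags, most recently freed last
        self.next_new = 1    # next never-used tag

    def allocate(self):
        if self.free:
            tag = self.free.pop()      # reuse the most recently freed tag
            self.seq[tag] += 1         # a reused tag gets a new sequence number
        else:
            tag = self.next_new
            self.next_new += 1
            self.seq[tag] = 1
        return tag, self.seq[tag]

    def release(self, tag):
        self.free.append(tag)

tags = ToyTagMap()
first = [tags.allocate() for _ in range(3)]
tags.release(3)
tags.release(2)
print(first, tags.allocate())
```

A stale directory entry that still names the old tag and old sequence number no longer matches the tagmap entry once the tag has been handed to a new file.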
34. Create, via touch, a file vii and notice where its directory entry is placed.
# touch vii
#
# vfilepg -r bruden_dom dennis_fset -t 14
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0
--------------------------------------------------------------------------
000000 0e 00 00 00 14 00 01 00 2e 00 00 00 0e 00 00 00 ................
000010 01 80 00 00 02 00 00 00 14 00 02 00 2e 2e 00 00 ................
000020 02 00 00 00 01 80 00 00 0f 00 00 00 18 00 05 00 ................
000030 73 6d 73 75 62 00 00 00 0f 00 00 00 01 80 00 00 smsub...........
000040 11 00 00 00 14 00 01 00 69 00 00 00 11 00 00 00 ........i.......
000050 01 80 00 00 12 00 00 00 14 00 03 00 76 69 69 00 ............vii.
000060 12 00 00 00 02 80 00 00 00 00 00 00 14 00 00 00 ................
000070 69 69 69 00 13 00 00 00 01 80 00 00 14 00 00 00 iii.............
000080 14 00 02 00 69 76 00 00 14 00 00 00 01 80 00 00 ....iv..........
000090 15 00 00 00 14 00 01 00 76 00 00 00 15 00 00 00 ........v.......
0000a0 01 80 00 00 00 00 00 00 5c 01 00 00 00 00 00 00 ........\.......
(…)
#
#
# vfilepg -r bruden_dom dennis_fset testdir -f d
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 92080 page 0
--------------------------------------------------------------------------
tag name
14 .
2 ..
15 smsub
17 i
18 vii
20 iv
21 v
35. Create several files with very, very long names and see how the creation of directory entries avoids crossing the sector boundaries.
#
# touch Not_so_long_but_still_pretty_longggggggggggggggggggggggggggggggggg
nggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg
ggggggggggggggggggggggggggggggggggggggg
#
# ls -li
total 92
19 -rw-r--r-- 1 root system 0 Sep 29 18:30
Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg
17 -rw-r--r-- 1 root system 0 Sep 29 17:57 i
20 -rw-r--r-- 1 root system 0 Sep 29 17:57 iv
15 -rw-r--r-- 1 root system 93342 Sep 29 11:09 smsub
21 -rw-r--r-- 1 root system 0 Sep 29 17:57 v
18 -rw-r--r-- 1 root system 0 Sep 29 18:25 vii
# ls -lid
14 drwxr-xr-x 2 root system 8192 Sep 29 18:30 .
#
#
#
# ls -li
total 92
19 -rw-r--r-- 1 root system 0 Sep 29 18:31
Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg
22 -rw-r--r-- 1 root system 0 Sep 29 18:31
Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg1
23 -rw-r--r-- 1 root system 0 Sep 29 18:31
Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg2
24 -rw-r--r-- 1 root system 0 Sep 29 18:31
Not_so_long_but_still_pretty_longgggggggggggggggggggggggggggggggggggggggggggggggggggggg
gggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg3
17 -rw-r--r-- 1 root system 0 Sep 29 17:57 i
20 -rw-r--r-- 1 root system 0 Sep 29 17:57 iv
15 -rw-r--r-- 1 root system 93342 Sep 29 11:09 smsub
21 -rw-r--r-- 1 root system 0 Sep 29 17:57 v
18 -rw-r--r-- 1 root system 0 Sep 29 18:25 vii
#
# ls -lid
14 drwxr-xr-x 2 root system 8192 Sep 29 18:31 .
36. Perform a showfile -i command on an 8K directory file. Make a larger directory file (use the script shown here if you like). What does showfile -i indicate on the larger directory?
# showfile -i /usr/dennis/testdir
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
/usr/dennis/testdir: does not have an index file
#
#
# cat make_100
#! /usr/bin/ksh
integer x=0
integer end=0
read x?"Start number? "
end=$x+100
while (( $x < $end ))
do
touch file$x
x=$x+1
done
#
# ls -lid
14 drwxr-xr-x 2 root system 8192 Sep 29 18:35 .
#
#
# ./make_100
Start number? 0
#
# ls -lid
14 drwxr-xr-x 2 root system 8192 Sep 29 18:46 .
#
# ./make_100
Start number? 100
#
# ls -lid
14 drwxr-xr-x 2 root system 8192 Sep 29 18:46 .
#
# ./make_100
Start number? 200
#
# ls -lid
14 drwxr-xr-x 2 root system 16384 Sep 29 18:46 .
#
#
# showfile -i .
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
146.8001 1 16 2 simple ** ** ftx 100% index (.)
#
#
#
# pwd
/usr/dennis/testdir
37. To convince yourself that something unusual really is happening, execute the following commands:
$ dd if=/etc/disktab of=frag.file
$ ls -l frag.file
$ showfile -x frag.file
# dd if=/etc/disktab of=frag.file
60+1 records in
60+1 records out
#
#
# ls -li frag.file
327 -rw-r--r-- 1 root system 31114 Sep 29 18:48 frag.file
#
#
# showfile -x frag.file
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
147.8001 2 16 3 simple ** ** async 50% frag.file
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 2 2 75424 32
2 1 2 98464 16
extentCnt: 2
#
38. Execute the ls -l command on some fragment bitfiles. Remember that .tags/1 gets you to a fileset’s fragment bitfile.
# ls -li /usr/dennis/.tags/1
1 ---------- 0 root system 786432 Dec 31 1969 /usr/dennis/.tags/1
#
#
# ls -li /usr/bruce/.tags/1
1 ---------- 0 root system 0 Dec 31 1969 /usr/bruce/.tags/1
#
39. Copy some randomly sized file, such as /etc/disktab, onto your AdvFS file system. Now use nvbmtpg to find out where your file’s fragment resides within the fragment bitfile.
# cp /etc/disktab /usr/dennis
#
# ls -li di*
328 -rwxr-xr-x 1 root system 31114 Sep 29 18:52 disktab
#
#
# nvbmtpg -rv bruden_dom dennis_fset -t 328
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 176 BMT page 4
--------------------------------------------------------------------------
pageId 4 megaVersion 4
freeMcellCnt 27 nextFreePg 5 nextfreeMCId page,cell 4,1
--------------------------------------------------------------------------
CELL 0 linkSegment 0 bfSetTag 2 (2.8001) tag 328 (148.8001)
next mcell volume page cell 0 0 0
RECORD 0 bCnt 92 version 0 BSR_ATTR (2)
type BSRA_VALID (3)
bfPgSz 16 transitionId 852
cloneId 0 cloneCnt 3 maxClonePgs 0
deleteWithClone 0 outOfSyncClone 0
cl.dataSafety BFD_NIL (0)
cl reqServices 1 optServices 0 extendSize 0 rsvd1 0
rsvd2 0 acl 0 rsvd_sec1 0 rsvd_sec2 0 rsvd_sec3 0
RECORD 1 bCnt 80 version 0 BSR_XTNTS (1)
type BSXMT_APPEND (0)
chain mcell volume page cell 0 0 0
blksPerPage 16 segmentSize -1484947200
delLink next page,cell 56286197,22 prev page,cell 20828416,0
delRst volume,page,cell 0,0,0 xtntIndex 0 offset 0 blocks 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 1105264 (0x10dd70)
bsXA[ 1] bsPage 3 vdBlk -1
RECORD 2 bCnt 92 version 0 BMTR_FS_STAT (255)
st_ino 328 st_mode 100755 (S_IFREG) st_nlink 1 st_size 31114
st_uid 0 st_gid 0 st_rdev 0 major 0 minor 0
st_mtime Wed Sep 29 18:52:20 1999 st_umtime 360751000
st_atime Wed Sep 29 18:52:20 1999 st_uatime 355868000
st_ctime Wed Sep 29 18:52:20 1999 st_uctime 360751000
fragId.frag 8 fragId.type 7 BF_FRAG_7K fragPageOffset 3
dir_tag 2 (2.8001) st_flags 0 st_unused_1 196608 st_unused_2 0
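The page count and fragment type in this output follow from the file size. The sketch below assumes 8K pages (bfPgSz 16 blocks of 512 bytes) and fragments in 1K steps, and it applies only to files that end in a fragment; it does not model how AdvFS decides whether a file gets a fragment at all. The names are illustrative.

```python
PAGE_BYTES = 8192   # 16 blocks of 512 bytes, per bfPgSz 16
FRAG_UNIT = 1024    # fragments come in 1K steps from 1K to 7K

def split_file_size(st_size):
    """Return (whole pages, fragment size in KB) for a file that ends in a fragment."""
    pages, tail = divmod(st_size, PAGE_BYTES)
    frag_kb = -(-tail // FRAG_UNIT)   # ceiling division
    return pages, frag_kb

print(split_file_size(31114))    # disktab copy
print(split_file_size(124456))   # sm1
```

For the 31114-byte disktab copy this gives 3 whole pages plus a 7K fragment, which agrees with bsPage 3 ending the extent map, fragPageOffset 3, and BF_FRAG_7K. For sm1 (124456 bytes) it gives 15 pages and a 2K fragment, matching the earlier showfile and nvbmtpg output.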
40. Now use dd to copy that fragment directly out of the fragment bitfile. You will use a command similar to:
# dd if=/playpen/.tags/1 of=/tmp/copy ibs=1024 iseek=76275 count=3
# dd if=/usr/dennis/.tags/1 of=/tmp/copy ibs=1024 iseek=8 count=1
1+0 records in
2+0 records out
#
# cat /tmp/copy
:
ra71|RA71|DEC RA71 Winchester:\
:ty=winchester:dt=MSCP:ns#51:nt#14:nc#1915:\
:oa#0:pa#131072:ba#8192:fa#1024:\
:ob#131072:pb#262144:bb#8192:fb#1024:\
:oc#0:pc#1367310:bc#8192:fc#1024:\
:od#393216:pd#324698:bd#8192:fd#1024:\
:oe#717914:pe#324698:be#8192:fe#1024:\
:of#1042612:pf#324698:bf#8192:ff#1024:\
:og#393216:pg#819200:bg#8192:fg#1024:\
:oh#1212416:ph#154894:bh#8192:fh#1024:
(…)
#
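The dd parameters can be derived from the fragId fields of the BMTR_FS_STAT record. This sketch assumes that fragId.frag counts 1K units from the start of the fragment bitfile, which is what the exercise implies by pairing ibs=1024 iseek=8 with fragId.frag 8. The function name is illustrative.

```python
def frag_byte_range(frag_id, frag_type_kb, unit=1024):
    """Return (byte offset, byte length) of a fragment inside the fragment bitfile."""
    return frag_id * unit, frag_type_kb * unit

offset, length = frag_byte_range(frag_id=8, frag_type_kb=7)
# Same request expressed in 1K records, as the dd command above does.
print(f"ibs=1024 iseek={offset // 1024} count={length // 1024}")
```

The solution above uses count=1, so it copies only the first 1K of the 7K fragment; a count of 7 would cover the whole fragment, of which only the tail of the file is meaningful data.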
41. The program nvfragpg, found in /sbin/advfs, prints various interesting statistics about fragment usage within the eight different fragment groups. Read the reference page for this command and then apply it to each of your AdvFS filesets.
# nvfragpg -rv bruden_dom dennis_fset
==========================================================================
DOMAIN "bruden_dom"
--------------------------------------------------------------------------
frag type free 1K 2K 3K 4K 5K 6K 7K totals
groups 1 1 1 0 1 1 0 1 6
frags - 127 63 0 31 25 0 18 264
frags used - 1 1 0 1 0 0 2 5
disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K
space used - 1K 2K 0K 4K 0K 0K 14K 21K
space free 127K 126K 124K 0K 120K 125K 0K 112K 734K
overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K
wasted - 0K 1K 0K 3K 2K 0K 1K 7K
% used - <1% 1% 0% 3% <1% 0% 9% 2%
PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1
PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1
PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1
PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1
PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1
PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1
#
#
# nvfragpg -rvf bruden_dom dennis_fset
==========================================================================
DOMAIN "bruden_dom"
--------------------------------------------------------------------------
frag type free 1K 2K 3K 4K 5K 6K 7K totals
groups 1 1 1 0 1 1 0 1 6
frags - 127 63 0 31 25 0 18 264
frags used - 1 1 0 1 0 0 2 5
disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K
space used - 1K 2K 0K 4K 0K 0K 14K 21K
space free 127K 126K 124K 0K 120K 125K 0K 112K 734K
overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K
wasted - 0K 1K 0K 3K 2K 0K 1K 7K
% used - <1% 1% 0% 3% <1% 0% 9% 2%
head of free lists of frag groups from fileset attributes:
frag type BF_FRAG_ANY firstFreeGrp 80 lastFreeGrp 32
frag type BF_FRAG_1K firstFreeGrp 64 lastFreeGrp 64
frag type BF_FRAG_2K firstFreeGrp 48 lastFreeGrp 48
frag type BF_FRAG_3K firstFreeGrp -1 lastFreeGrp -1
frag type BF_FRAG_4K firstFreeGrp 32 lastFreeGrp 32
frag type BF_FRAG_5K firstFreeGrp 16 lastFreeGrp 16
frag type BF_FRAG_6K firstFreeGrp -1 lastFreeGrp -1
frag type BF_FRAG_7K firstFreeGrp 0 lastFreeGrp 0
BF_FRAG_ANY groups on the free list
PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1
BF_FRAG_1K groups on the free list
PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1
BF_FRAG_2K groups on the free list
PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1
BF_FRAG_4K groups on the free list
PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1
BF_FRAG_5K groups on the free list
PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1
BF_FRAG_7K groups on the free list
PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1
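Several rows of the nvfragpg table can be reproduced with simple arithmetic. The constants below are read off the table itself (128K of disk space and 1K of overhead per group); the mapping from a fragment ID to its group assumes that fragment IDs count 1K units from the start of the fragment bitfile. The function names are illustrative.

```python
GROUP_KB = 128      # "disk space" per group in the nvfragpg table
HEADER_KB = 1       # "overhead" per group
GROUP_PAGES = 16    # 128K / 8K pages, matching PAGE 0, 16, 32, ...

def frags_per_group(frag_kb):
    """Number of fragments of a given size that fit after the group header."""
    return (GROUP_KB - HEADER_KB) // frag_kb

def wasted_kb(frag_kb):
    """Space left over in a group that is too small to hold another fragment."""
    return (GROUP_KB - HEADER_KB) - frags_per_group(frag_kb) * frag_kb

def group_start_page(frag_id):
    """Assumes frag ids count 1K units from the start of the fragment bitfile."""
    return (frag_id // GROUP_KB) * GROUP_PAGES

for kb in (1, 2, 4, 5, 7):
    print(f"{kb}K: {frags_per_group(kb)} frags, {wasted_kb(kb)}K wasted")
print(group_start_page(386), group_start_page(8))
```

The loop reproduces the "frags" and "wasted" rows for the fragment types in use. Under the stated assumption, fragment 386 (the 2K fragment of sm1) falls in the group at page 48, which nvfragpg lists as BF_FRAG_2K, and fragment 8 (the 7K fragment of disktab) falls in the group at page 0, listed as BF_FRAG_7K.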
42. Use the nvtagpg command to find the mcell IDs of some AdvFS bitfile-sets. Now use nvfragpg to print out the addresses of the fragment group headers for these filesets.
# nvtagpg -rv bruden_dom dennis_fset
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 34688 "dennis_fset" TAG page 0
--------------------------------------------------------------------------
currPage 0
numAllocTMaps 328 numDeadTMaps 0 nextFreePage 0 nextFreeMap 330 padding 0
tMapA[0] tag 0 seqNo 1 (NOT USED) primary (vol,page,cell) 0 0 1
tMapA[1] tag 1 seqNo 1 primary mcell (vol,page,cell) 2 0 4
tMapA[2] tag 2 seqNo 1 primary mcell (vol,page,cell) 1 0 8
tMapA[3] tag 3 seqNo 1 primary mcell (vol,page,cell) 2 0 5
tMapA[4] tag 4 seqNo 1 primary mcell (vol,page,cell) 1 0 10
tMapA[5] tag 5 seqNo 1 primary mcell (vol,page,cell) 2 0 6
tMapA[6] tag 6 seqNo 1 primary mcell (vol,page,cell) 2 0 7
tMapA[7] tag 7 seqNo 1 primary mcell (vol,page,cell) 1 0 12
tMapA[8] tag 8 seqNo 1 primary mcell (vol,page,cell) 2 0 8
tMapA[9] tag 9 seqNo 2 primary mcell (vol,page,cell) 1 0 17
tMapA[10] tag 10 seqNo 1 primary mcell (vol,page,cell) 2 0 10
tMapA[11] tag 11 seqNo 1 primary mcell (vol,page,cell) 3 0 6
tMapA[12] tag 12 seqNo 1 primary mcell (vol,page,cell) 2 0 12
tMapA[13] tag 13 seqNo 1 primary mcell (vol,page,cell) 1 0 15
tMapA[14] tag 14 seqNo 1 primary mcell (vol,page,cell) 1 0 22
tMapA[15] tag 15 seqNo 1 primary mcell (vol,page,cell) 2 0 13
tMapA[16] tag 16 seqNo 1 primary mcell (vol,page,cell) 1 0 24
(…)
tMapA[328] tag 328 seqNo 1 primary mcell (vol,page,cell) 3 4 0
tMapA[329] tag 329 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 11
tMapA[330] tag 330 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 12
tMapA[331] tag 331 seqNo 1 (NOT USED) primary (vol,page,cell) 0 10 13
(…)
tMapA[1021] tag 1021 seqNo 1 (NOT USED) primary (vol,page,cell) 0 0 0
#
#
# nvfragpg -rv bruden_dom dennis_fset
==========================================================================
DOMAIN "bruden_dom"
--------------------------------------------------------------------------
frag type free 1K 2K 3K 4K 5K 6K 7K totals
groups 1 1 1 0 1 1 0 1 6
frags - 127 63 0 31 25 0 18 264
frags used - 1 1 0 1 0 0 2 5
disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K
space used - 1K 2K 0K 4K 0K 0K 14K 21K
space free 127K 126K 124K 0K 120K 125K 0K 112K 734K
overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K
wasted - 0K 1K 0K 3K 2K 0K 1K 7K
% used - <1% 1% 0% 3% <1% 0% 9% 2%
PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1
PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1
PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1
PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1
PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1
PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1
43. The nvfragpg program, also located in /sbin/advfs, will print out a list of the free fragments found within a fragment group along with the address of the next group of that type.
# nvfragpg -rvf bruden_dom dennis_fset
==========================================================================
DOMAIN "bruden_dom"
--------------------------------------------------------------------------
frag type free 1K 2K 3K 4K 5K 6K 7K totals
groups 1 1 1 0 1 1 0 1 6
frags - 127 63 0 31 25 0 18 264
frags used - 1 1 0 1 0 0 2 5
disk space 128K 128K 128K 0K 128K 128K 0K 128K 768K
space used - 1K 2K 0K 4K 0K 0K 14K 21K
space free 127K 126K 124K 0K 120K 125K 0K 112K 734K
overhead 1K 1K 1K 0K 1K 1K 0K 1K 6K
wasted - 0K 1K 0K 3K 2K 0K 1K 7K
% used - <1% 1% 0% 3% <1% 0% 9% 2%
head of free lists of frag groups from fileset attributes:
frag type BF_FRAG_ANY firstFreeGrp 80 lastFreeGrp 32
frag type BF_FRAG_1K firstFreeGrp 64 lastFreeGrp 64
frag type BF_FRAG_2K firstFreeGrp 48 lastFreeGrp 48
frag type BF_FRAG_3K firstFreeGrp -1 lastFreeGrp -1
frag type BF_FRAG_4K firstFreeGrp 32 lastFreeGrp 32
frag type BF_FRAG_5K firstFreeGrp 16 lastFreeGrp 16
frag type BF_FRAG_6K firstFreeGrp -1 lastFreeGrp -1
frag type BF_FRAG_7K firstFreeGrp 0 lastFreeGrp 0
BF_FRAG_ANY groups on the free list
PAGE 80 lbn 99072 BF_FRAG_ANY version 1 freeFrags 0 nextFreeGrp -1
BF_FRAG_1K groups on the free list
PAGE 64 lbn 98816 BF_FRAG_1K version 1 freeFrags 126 nextFreeGrp -1
BF_FRAG_2K groups on the free list
PAGE 48 lbn 98560 BF_FRAG_2K version 1 freeFrags 62 nextFreeGrp -1
BF_FRAG_4K groups on the free list
PAGE 32 lbn 121904 BF_FRAG_4K version 1 freeFrags 30 nextFreeGrp -1
BF_FRAG_5K groups on the free list
PAGE 16 lbn 121648 BF_FRAG_5K version 1 freeFrags 25 nextFreeGrp -1
BF_FRAG_7K groups on the free list
PAGE 0 lbn 121392 BF_FRAG_7K version 1 freeFrags 16 nextFreeGrp -1
44. Start out by running od -x on one of your storage bitmap files. The command syntax will be something like:
# od -x -N 1024 /usr/.tags/-7
# od -x -N 1024 /usr/dennis/.tags/M-7
0000000 0000 0000 c700 ffff ffff ffff ffff ffff
0000020 ffff ffff ffff ffff ffff ffff ffff ffff
*
0000120 ffff ffff ffff ffff 00ff 0000 0000 0000
0000140 0000 0000 0000 0000 0000 0000 0000 0000
*
0000420 0000 0000 c000 ffff ffff ffff ffff ffff
0000440 ffff ffff ffff ffff ffff ffff ffff ffff
*
0001340 07ff 0000 0000 0000 0000 0000 0000 0000
0001360 0000 0000 0000 0000 0000 0000 0000 0000
*
0020000
#
45. Repeat the previous exercise, but this time use the virtual disk interface. You must use showfile -x to find the extent map for the storage bitmap. It will look similar to:
# od -x -j 112b -N 1024 /dev/disk/dsk3c
# od -x -j 112b -N 1024 /dev/rdisk/dsk0a
0000000 0000 0000 c700 ffff ffff ffff ffff ffff
0000016 ffff ffff ffff ffff ffff ffff ffff ffff
*
0000080 ffff ffff ffff ffff 00ff 0000 0000 0000
0000096 0000 0000 0000 0000 0000 0000 0000 0000
*
0000272 0000 0000 c000 ffff ffff ffff ffff ffff
0000288 ffff ffff ffff ffff ffff ffff ffff ffff
*
0000736 07ff 0000 0000 0000 0000 0000 0000 0000
0000752 0000 0000 0000 0000 0000 0000 0000 0000
*
0008192 0002 0000 0003 0000 2c39 37f1 63ea 0002
0008208 0001 0000 0000 ffff 0000 0000 0002 0000
(...)
46. Here’s how to determine if page 17000 of an AdvFS virtual disk is free:
# expr 17000 / 8184 \* 8192 + 17000 % 8184 + 8
17024
# od -x -j 17024 -N 1 /usr/.tags/-7
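The expr command evaluates the multiplication, division, and remainder before the additions, working left to right. The sketch below reproduces that arithmetic exactly as printed. The constants suggest that each 8192-byte SBM page holds 8184 bytes of bitmap data preceded by 8 other bytes; that reading, and the names used, are assumptions.

```python
SBM_PAGE_BYTES = 8192   # size of one SBM page in the file
SBM_DATA_BYTES = 8184   # bitmap bytes per page; the other 8 bytes precede them

def sbm_file_offset(index):
    """Reproduce: expr index / 8184 \\* 8192 + index % 8184 + 8"""
    page, within = divmod(index, SBM_DATA_BYTES)
    return page * SBM_PAGE_BYTES + within + (SBM_PAGE_BYTES - SBM_DATA_BYTES)

print(sbm_file_offset(17000))
```

The result is the byte offset that the od command then passes to -j.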
47. Tru64 UNIX V5 supplies a much more convenient command, vsbmpg. Look at the reference page for this command. Accomplish the same result as Exercise 46 without all the arithmetic. Find out if page 50000 of one of your virtual disks is free.
# vsbmpg -r bruden_dom 1 -B 50000
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 112 SBM page 0
--------------------------------------------------------------------------
blocks 50000 - 50015 (0xc350 - 0xc35f) are mapped by SBM map entry 97, bit 21
mapInt[97] 11111111 11111111 11111111 11111111
block 50000 ^
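The map entry and bit that vsbmpg reports can be reproduced by hand. The sketch below assumes one SBM bit per 16 blocks and 32 bits per mapInt entry, which is what the output above implies; the structure and function names are hypothetical, and SBM page boundaries are ignored.

```c
/* Assumed from the vsbmpg output above: 16 blocks per bit, 32 bits per entry. */
#define BLOCKS_PER_BIT 16UL
#define BITS_PER_ENTRY 32UL

struct sbm_pos {
    unsigned long entry;        /* index into mapInt[] */
    unsigned long bit;          /* bit number inside that entry */
    unsigned long first_block;  /* first block covered by the bit */
};

/* Hypothetical helper: locate the SBM bit that covers a block (first SBM page only). */
struct sbm_pos sbm_locate(unsigned long block)
{
    struct sbm_pos pos;
    unsigned long cluster = block / BLOCKS_PER_BIT;  /* cluster number of the block */

    pos.entry = cluster / BITS_PER_ENTRY;
    pos.bit = cluster % BITS_PER_ENTRY;
    pos.first_block = cluster * BLOCKS_PER_BIT;
    return pos;
}
```

For block 50000 this yields entry 97, bit 21, covering blocks 50000 through 50015, in agreement with the vsbmpg output.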
48. Use showfile to see the extents of your miscellaneous bitfile. Find them at .tags/-11, and so forth.
# showfile -x /usr/dennis/.tags/M-11
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
fffffff5.0000 1 16 4 simple ** ** ftx 50% M-11
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 2 1 0 32
2 2 1 64 32
extentCnt: 2
#
#
# showfile -x /usr/dennis/.tags/M-17
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
ffffffef.0000 2 16 4 simple ** ** ftx 50% M-17
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 2 2 0 32
2 2 2 64 32
extentCnt: 2
#
#
# showfile -x /usr/dennis/.tags/M-23
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
ffffffe9.0000 3 16 4 simple ** ** ftx 50% M-23
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 2 3 0 32
2 2 3 64 32
extentCnt: 2
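The tags used above (-11, -17, -23 for volumes 1, 2, and 3, and -7 for the storage bitmap on volume 1) fit a simple pattern. The sketch below assumes six reserved tags per volume and infers the slot numbers from the values in this guide; the names are hypothetical and not taken from the AdvFS headers.

```c
/* Assumed from the spacing of -11, -17, -23: six reserved tags per volume. */
#define RSVD_TAGS_PER_VOL 6

/* Slot numbers inferred from the tags shown in this guide. */
enum rsvd_slot {
    SLOT_SBM = 1,   /* storage bitmap, tag -7 on volume 1 */
    SLOT_MISC = 5   /* miscellaneous bitfile, tag -11 on volume 1 */
};

/* Hypothetical helper: reserved tag number for a slot on a 1-based volume index. */
int rsvd_tag(int vol, enum rsvd_slot slot)
{
    return -(vol * RSVD_TAGS_PER_VOL + (int)slot);
}
```

With these assumptions, rsvd_tag(2, SLOT_MISC) is -17, matching the .tags/M-17 path used for volume 2.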
3
AdvFS In-Memory Structures
About This Chapter
Introduction
This chapter presents information about the AdvFS in-memory structures. These structures can be examined in a live system or in a crash dump. The structures can be conceptually layered just as the on-disk structures are:
• Virtual file system (VFS)
• File access subsystem (FAS)
• Bitfile access subsystem (BAS)
An understanding of these structures will provide essential information for detailed troubleshooting.
Objectives
To describe AdvFS in-memory structures, you should be able to:
• Describe VFS structures.
• Describe FAS layer in-memory structures.
• Describe BAS layer in-memory structures.
• List other in-memory structures:
— Free space cache
— Bitfile buffer descriptor
— I/O descriptor
Resources
For more information on topics in this chapter as well as related topics, see the following:
• Advanced File System Administration
• AdvFS Reference Pages
• Header Files
Examining AdvFS In-Memory Structures
Overview
As discussed in the AdvFS On-Disk Structures chapter, the AdvFS file system software is arranged in a hierarchy of layers. This section looks at some of the in-memory data structures representing the on-disk data. When reading a crash dump, this may be the only information available.
• Overview of in-memory structures
• Big picture
Overview of In-Memory Structures
Recall that all I/O flows through the VFS software, which directs the flow to the appropriate file system-specific software. The following file system layers are supported by in-memory structures:
VFS layer    Vnode and mount structures
FAS layer    POSIX file and fileset structures
BAS layer    Bitfile and bitfile-set structures
Big Picture of Data Structure Linkage
The following figure will serve as a general reference for the linkage of the data structures to be studied.
Figure 3-1: Big Picture
[Figure: structure linkage diagram. Labels shown: thread, task (super_task), proc, utask, systemwide file table, vnode, bfNode, bfAccess, fsContext, extents, rootfs, mount, fileSetNode, bfSet, domain, vd. The figure also shows the dbx command "func thread_block".]
Checking the VFS Layer
Overview
The Virtual File System (VFS) acts as a director of subsequent file system activities. It determines which type of file system an I/O request is destined for and directs the logic flow accordingly.
• VFS-specific structures
• vnode structure
• mount structure
VFS-Specific Structures
The VFS layer consists primarily of the following structures and fields:
File descriptor:
• Returned from the open(2) system call
• Used in the per-process utask structure
• Points (indirectly) to file structure
The following example shows an open(2) system call returning a file descriptor to the caller.
Example 3-1: Open System Call Returning a File Descriptor
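#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int fd;

    /* Open (or create) the file; open(2) returns -1 on failure. */
    fd = open("/usr/bruden/ob_1", O_RDWR | O_CREAT, 0777);
    if (fd == -1) {
        perror("open failed ");
        exit(EXIT_FAILURE);
    }
    printf("file opened -- file descriptor is %d.\n", fd);
    return 0;
}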
File structure:
• Contains file credentials
• Contains file offset
• Points to vnode
The vnode has a file system-specific extension leading to FAS in-memory information.
Find the file structure by using the file descriptor as an index into an array of pointers in the utask structure. This linkage has changed significantly in V5 due to support for up to 64K concurrently opened files per process. The following example shows the fields of the file structure and points out the f_data field, which will point to the vnode for the open file.
Example 3-2: Fields of the File Structure
(dbx) whatis struct file
struct file {
    simple_lock_data_t f_incore_lock;
    int f_flag;
    uint_t f_count;
    int f_type;
    int f_msgcount;
    struct ucred * f_cred;
    struct fileops * f_ops;
    caddr_t f_data;            <== This field will most likely point to a vnode
    union {
        off_t fu_offset;
        struct file * fu_freef;
    } f_u;
    uint_t f_io_lock;
    int f_io_waiters;
};
The following example shows a path to the utask structure which contains open file information.
Example 3-3: Using utask to Get Open File Information
(dbx) set $pid=953
(dbx) p (*(struct super_task *)thread.task).utask
struct {
    uu_comm = "openone"
    uu_maxuprc = 64
    uu_logname = 0xfffffc0002fcc4a0 = "root"
(...)
    uu_file_state = struct {
(...)
        uf_entry = {
            [0] 0xfffffc00017bd700    <== Indirectly points to file structure
            [1] (nil)
            [2] (nil)
            [3] (nil)
            [4] (nil)
            [5] (nil)
            [6] (nil)
            [7] (nil)
        }
Here is a route to the vnode using the file descriptor (number 3) to get to the file structure and from there to the vnode. This can be crafted into a dbx macro which expects to be given the PID of the process and the file descriptor number.
Example 3-4: Getting to the vnode
(dbx) p *(struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][3].ufe_ofile.f_data
struct {
    v_lock = 0
(...)
    v_type = VREG                      <== Regular file
    v_tag = VT_MSFS                    <== AdvFS
    v_mount = 0xfffffc0005ab2a80       <== To mount structure
    v_mountedhere = (nil)
    v_op = 0xfffffc00006b01d0
    v_freef = (nil)
    v_freeb = (nil)
    v_mountf = 0xfffffc000367bb00
    v_mountb = 0xfffffc0001c3a958
(...)
    v_data = "ˆ"                       <== Begins file system specific information
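The chain followed by the dbx expression (file descriptor, file structure, vnode, private area) can be modeled in user space. The sketch below uses simplified stand-in structures that keep only the fields on that path; every type and function name ending in _sketch, and fd_to_access itself, is hypothetical and none of this is kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel structures; only the path fields are kept. */
struct bf_access_sketch { int placeholder; };                /* BAS-layer access structure */
struct bf_node_sketch { struct bf_access_sketch *accessp; }; /* AdvFS private area */

struct vnode_sketch {
    int v_type;
    char v_data[sizeof(struct bf_node_sketch)];  /* private area embedded at the end */
};

struct file_sketch { void *f_data; };            /* f_data points to the vnode */
struct ufe_sketch { struct file_sketch *ufe_ofile; };

/* Follow fd -> file -> vnode -> bfNode -> bfAccess; NULL if any link is missing. */
struct bf_access_sketch *fd_to_access(struct ufe_sketch *table, size_t nfds, size_t fd)
{
    if (table == NULL || fd >= nfds || table[fd].ufe_ofile == NULL)
        return NULL;

    struct vnode_sketch *vp = table[fd].ufe_ofile->f_data;
    if (vp == NULL)
        return NULL;

    /* Copy the private area out instead of casting, to avoid alignment assumptions. */
    struct bf_node_sketch node;
    memcpy(&node, vp->v_data, sizeof node);
    return node.accessp;
}
```

The key point the model captures is that v_data is not a pointer: the file system-specific area begins inside the vnode itself, which is why the dbx alias later in this chapter takes the address of v_data.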
vnode Structure
Every open file is represented by a vnode. If the file is within an AdvFS file system, the vnode will have a file system-specific extension called a bfNode structure.
Characteristics of the vnode structure include:
• One exists per opened file
• Points to mount structure
• Points to VM object
• Points to vnode switch table
• Ends with a file system specific private area extension
mount Structure
Each mounted file system is represented by a mount structure. All mount structures are located from a linked list whose head is named rootfs.
Characteristics of the mount structure include:
• One exists per active file system
• Points to root vnode
• Starts linked list of file system vnodes
• Points to VFS switch table
• Points to file system-specific private area
The following example uses rootfs to locate a mount structure. Note that we are examining the third mount structure. The structures are linked through the m_nxt field.
Example 3-5: Viewing a mount Structure Using rootfs
(dbx) p *rootfs.m_nxt.m_nxt
struct {
    m_lock = 18446739675758144512
    m_flag = 20480
    m_funnel = 0
    m_nxt = 0xfffffc0005ab2d80         <== To next mount structure
    m_prev = 0xfffffc0005ab3380
    m_op = 0xfffffc00006af990
    m_vnodecovered = 0xfffffc0001a8f200
    m_mounth = 0xfffffc0004d5a240
    m_vlist_lock = 0
    m_exroot = 0
    m_uid = 0
    m_stat = struct {
        f_type = 10
        f_flags = 20480
        f_fsize = 512
        f_bsize = 8192
        f_blocks = 1426112
        f_bfree = 400688
        f_bavail = 324960
        f_files = 582215
        f_ffree = 558528
(...)
        f_mntonname = 0xfffffc0000d34c20 = "/usr"
        f_mntfromname = 0xfffffc0000d34940 = "usr_domain#usr"
(...)
        msfs_args = struct {
            id = struct {
                id1 = 937059922
                id2 = 653520
                tag = 1
            }
        }
(...)
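Stepping through m_nxt by hand, as in the example above, amounts to a list walk. The following user-space sketch searches a list of simplified mount structures for a mount point. The structure is a stand-in (the real f_mntonname lives inside m_stat), and the walk stops either at a nil pointer or on returning to the head, since the example does not show how the list terminates.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct mount; only the fields used by the walk. */
struct mount_sketch {
    struct mount_sketch *m_nxt;  /* next mount structure */
    const char *f_mntonname;     /* mount point (inside m_stat in the real structure) */
};

/* Hypothetical helper: find the mount structure for a mount point, or NULL. */
const struct mount_sketch *find_mount(const struct mount_sketch *rootfs, const char *path)
{
    const struct mount_sketch *mp = rootfs;

    while (mp != NULL) {
        if (strcmp(mp->f_mntonname, path) == 0)
            return mp;
        mp = mp->m_nxt;
        if (mp == rootfs)  /* wrapped around a circular list */
            break;
    }
    return NULL;
}
```

In dbx the same search is done interactively by appending another .m_nxt to the expression until f_mntonname shows the file system of interest.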
Explaining the FAS Layer
Overview
The FAS layer is the upper of the two (FAS and BAS) AdvFS layers. It is represented by structures for each open file and each mounted fileset. These hold AdvFS-specific information while the VFS structures hold information common among file systems.
• FAS layer structures
• bfNode structure
• fsContext structure
• In-Memory Per File Structures
• fileSetNode structure
FAS Layer Structures
The FAS layer has structures to represent the open file within the context of user files and directories and filesets. It enables access to the lower-level, bitfile-based structures. FAS layer structures include:
bfNode structure:
• AdvFS vnode
• Points to BAS layer bfAccess structure
Fileset context:
• Points to parent fileset
• fsContext structure
Fileset node:
• AdvFS private mount information
• fileSetNode structure
In-Memory Per File Structures
The file access subsystem provides the interface to the storage system, the bitfile access subsystem. It provides per-file statistics and directories. It also provides access to symbolic links stored in bitfile metadata table entries.
bfNode Structure
The first private area (FAS level) of an AdvFS file is the bfNode structure. This structure has changed in V5. Most notably, the first field is now a pointer rather than an AdvFS handle.
The following is an excerpt from ms_osf.h.
Example 3-6: Fields of the bfNode Structure
/*
 * bfNode is the msfs structure at the end of a vnode
 */
typedef struct bfNode {
    struct bfAccess *accessp;
    struct fsContext *fsContextp;
    bfTagT tag;
    bfSetIdT bfSetId;
} bfNodeT;
Source location: msfs/ms_osf.h
The following example uses an alias to get at the bfNode structure for an open file.
Example 3-7: Accessing bfNode Structure Using an Alias
(dbx) alias v5_get_ofile_bfNode_struct(pidd,fd) "set $pid=pidd; p *(struct bfNode *)&((struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][fd].ufe_ofile.f_data).v_data"
(dbx) v5_get_ofile_bfNode_struct(953,3)
struct {
    accessp = 0xfffffc0004d94d88
    fsContextp = 0xfffffc0004d5ae70
    tag = struct {
        num = 23704
        seq = 32770
    }
    bfSetId = struct {
        domainId = struct {
            tv_sec = 937059922
            tv_usec = 653520
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
}
(dbx) set $bfaccess=0xfffffc0004d94d88
(dbx) set $fscontext=0xfffffc0004d5ae70
fsContext Structure
Each file will have basic ownership and permission information available in a memory-based structure. There is also reference to the directory and fileset available through the fsContext structure.
Characteristics of the fsContext structure include:
• Located through the bfNode structure
• UNIX (POSIX) information about a file (rather than the bitfile)
• Contains:
— Quota information
— Tag of fileset
— Tag of file’s parent directory
— File statistics
The following example contains excerpts from fs_dir.h. It shows the fields of the fsContext data structure. Note the pointer to the fileSetNode structure.
Example 3-8: Fields of the fsContext Structure
struct fsContext {
    short initialized;        /* zero if fsContext is not initialized */
    short quotaInitialized;   /* zero if quota stuff is not initialized */
    bfTagT undel_dir_tag;     /* tag of undelete directory */
    long fs_flag;             /* flag word - see below */
    int dirty_stats;          /* flag for directories, says update the
                                 stats in the parent directory entry */
    int dirty_alloc;          /* set if stats from an allocating write
                                 are not on disk (ICHGMETA) */
    lock_data_t file_lock;    /* Use an OSF complex lock (read_write_lock) */
    long dirstamp;            /* stamp to determine directory changes */
    mutexT fsContext_mutex;   /* mutex to take out locks on this structure */
#ifdef ADVFS_DEBUG
    char file_name[30];       /* first 29 chars of file name */
#endif
    bfTagT bf_tag;            /* the tag for the file */
    long last_offset;         /* the offset of the last found entry */
    struct fs_stat dir_stats; /* stats */
    struct fileSetNode *fileSetNode;    /* pointer to per-fileset info */
    struct dQuot *diskQuot[MAXQUOTAS];  /* pointers to quota structs */
};
Source location: msfs/fs_dir.h.
The following displays the fsContext structure.
Example 3-9: Displaying the fsContext Structure
(dbx) p *(struct fsContext *)$fscontext
struct {
    initialized = 1
    quotaInitialized = 1
    undel_dir_tag = struct {
        num = 0
        seq = 0
    }
    fs_flag = 0
(...)
    bf_tag = struct {
        num = 23704
        seq = 32770
    }
    last_offset = 0
    dir_stats = struct {
        st_ino = struct {
            num = 23704
            seq = 32770
        }
        st_mode = 33261
        st_uid = 0
        st_gid = 0
        st_rdev = 0
        st_size = 31114
(...)
        fragId = struct {
            frag = 58113
            type = BF_FRAG_7K
        }
        st_nlink = 1
        st_unused_1 = 0
        fragPageOffset = 3
        st_unused_2 = 0
    }
    fileSetNode = 0xfffffc0005ab5088
    diskQuot = {
        [0] 0xfffffc0005ac7088
        [1] 0xfffffc0005ac7148
    }
}
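dbx prints st_mode in decimal, which hides its meaning. The sketch below splits the value into file type and permission bits using the conventional UNIX octal masks; the macro and function names are chosen for illustration only.

```c
/* Conventional UNIX mode masks, written in octal. */
#define MODE_TYPE_MASK 0170000u  /* file type bits */
#define MODE_REGULAR   0100000u  /* regular file */
#define MODE_PERM_MASK 07777u    /* permission and set-id bits */

/* Nonzero if the mode describes a regular file. */
int mode_is_regular(unsigned int mode)
{
    return (mode & MODE_TYPE_MASK) == MODE_REGULAR;
}

/* Permission bits of the mode. */
unsigned int mode_perms(unsigned int mode)
{
    return mode & MODE_PERM_MASK;
}
```

The value 33261 shown above is octal 0100755: a regular file with permissions 0755. That is consistent with the file having been created with mode 0777 under a umask of 022 (the umask is an assumption; it is not shown in the example).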
The major in-memory structure providing access to bitfiles is the bfAccess structure. It also points to the extent map. The bfNode structure is the main route from the FAS structures to the BAS structures through the bfAccess structure. These relationships are shown in the following figure.
Figure 3-2: In-Memory Per File Structures
In-Memory Per Fileset Structures
Characteristics of the in-memory, per fileset structures:
• The mount structure for this file system is linked to the mount structure of the root file system (m_nxt) which is found through the global symbol rootfs.
• The mounted file system’s mount structure (m_data) points to the file system- specific mount structure, in this case the AdvFS fileSetNode structure.
• The vnode of the mounted-upon directory (v_mountedhere) is set to point to the mounted file system’s mount structure to represent where the file system has been mounted.
• This mount structure (m_vnodecovered) points back to the vnode.
• Attached to the vnodes of the active files of the mounted file system are bfNode data structures, which represent the files.
[Figure 3-2 labels: disk block, extent map, bfAccess, vnode, bfNode, fsContext, mount, fileSetNode, bfSet, domain. Column headings: Per File; Fileset and Domain.]
fileSetNode Structure
An AdvFS fileset is represented within the FAS layer by the fileSetNode structure. The fileSetNode structure is the AdvFS-specific mount structure, and it has pointers to:
• domain structure
• vnode for root
• mount structure
The following is an excerpt from ms_osf.h describing the fields of the fileSetNode structure.
Example 3-10: Fields of the fileSetNode Structure
typedef struct fileSetNode {
    struct fileSetNode *fsNext;
    struct fileSetNode **fsPrev;
    bfTagT rootTag;           /* tag of root directory */
    bfTagT tagsTag;           /* tag of ".tags" */
    uint_t filesetMagic;      /* magic number: structure validation */
    domainT *dmnP;
    bfAccessT *rootAccessp;   /* Access structure pointer for root */
    bfSetIdT bfSetId;
    bfSetT *bfSetp;           /* bitfile-set descriptor pointer */
    struct vnode *root_vp;
    int fsFlags;              /* flags, see below */
    struct mount *mountp;     /* mount table pointer */
    unsigned quotaStatus;     /* see definitions below */
    long blkHLimit;           /* maximum quota blocks in fileset */
    long blkSLimit;           /* soft limit for fileset blks */
    long fileHLimit;          /* maximum number of files in fileset */
    long fileSLimit;          /* soft limit for fileset files */
    long blksUsed;            /* number of quota blocks used */
    long filesUsed;           /* number of bitfiles used */
    time_t blkTLimit;         /* time limit for excessive disk blk use */
    time_t fileTLimit;        /* time limit for excessive file use */
    mutexT filesetMutex;      /* protect next two fields */
    quotaInfoT qi[MAXQUOTAS];
    fileSetStatsT fileSetStats;
} fileSetNodeT;
Source location: msfs/ms_osf.h.
The following example displays the fileSetNode structure.
Example 3-11: Displaying the fileSetNode Structure
(dbx) p *(*(struct fsContext *)$fscontext).fileSetNode
struct {
    fsNext = (nil)
    fsPrev = 0xfffffc0005ab5348
    rootTag = struct {
        num = 2
        seq = 32769
    }
    tagsTag = struct {
        num = 3
        seq = 32769
    }
    filesetMagic = 2918187013          <== 0xadf00005
    dmnP = 0xfffffc0000f24008
    rootAccessp = 0xfffffc0005af7688
    bfSetId = struct {
        domainId = struct {
            tv_sec = 937059922
            tv_usec = 653520
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
    bfSetp = 0xfffffc0005b7ca08
    root_vp = 0xfffffc0005ac98c0
    fsFlags = 0
    mountp = 0xfffffc0005ab2a80
    quotaStatus = 1421
    blkHLimit = 0
    blkSLimit = 0
    fileHLimit = 0
    fileSLimit = 0
    blksUsed = 1025520
    filesUsed = 23689
(...)
    fileSetStats = struct {
        msfs_lookup = 19671
        lookup = struct {
            hit = 17494
            hit_not_found = 1079
            miss = 1098
        }
        msfs_create = 2
        msfs_mknod = 0
(...)
    }
}
(dbx) set $bfset=0xfffffc0005b7ca08
(dbx) set $domain=0xfffffc0000f24008
Figure 3-3: In-Memory Per Fileset Structures
Fileset Quota Structures
Since fileset quotas are a per-fileset capability, the information describing them is held within the fileSetNode structure.
Characteristics of the fileset quota structures include:
• fileSetNode contains an array of quotaInfo structures.
Identify fileset quotas
• fileSetNode contains several limit fields.
For fields set by chfsets
Source locations:
• msfs/msfs/ms_osf.h contains defines.
• msfs/fs/fs_quota.c contains routines.
[Figure 3-3 labels. List linkage: rootfs, m_nxt. UFS side: Mount Structure, m_info, ufsmount, vnodes, v_mount, v_data, inode, v_mountedhere. AdvFS side: Mount Structure, m_vnodecovered, m_info, fileSetNode, vnodes, v_mount, v_data, bfNode.]
User and Group Quota Structures
Since user and group quotas are pertinent to file usage, the information describing them is in the fsContext structure.
Characteristics of the user/group quota structures include:
• fsContext points to dQuot structures.
• Kernel maintains disk quota cache:
— Table is DqHashTbl
— Access via dqget()
• msfs/msfs/fs_quota.h contains includes.
• msfs/fs/fs_quota.c contains routines.
Locating the BAS Layer
Overview
The lowest layer of AdvFS is the BAS layer. Files are represented at this layer by bfAccess structures. This large structure locates the other BAS structures supporting bitfiles, bitfile-sets, domains, and volumes.
• BAS layer structure overview
• Access to BAS structures
• bfAccess structure
• Managing bfAccess structures
• bfSet structures
• Finding bfSet structures
• domain structures
• Finding domain structures
• vd structures
BAS Layer Structure Overview
BAS layer structures include:
bfAccess structure:
• Bitfile structure
• One per open file
bfSet structure:
• Bitfile-set structure
• One per fileset
domainT structure:
• File domain
• One per domain
vd structure:
• Virtual disks
• One per AdvFS volume
Access to BAS Structures
V5 of Tru64 UNIX, which supports both V3 and V4 of AdvFS, provides access to the BAS structures through pointers rather than handles. Most handles had to be divided into bit sequences to be used.
bfAccess Structure
The bfAccess structure holds the in-memory state of a bitfile. It contains:
• Links to other bitfile access structures
• Pointer to a vnode
• Pointer to a vm object
• Highest LSN written to a log
• Pointers to next clone’s bfAccess structure (if a clone)
• Bitfile set pointer
• Domain pointer
• Primary metadata cell ID
• Volume containing primary mcell
Source location: msfs/bs_access.h
Managing bfAccess Structures
The bfAccess structures are allocated as needed.
• Free access list
Linked list of available structures
• Closed access list
Closed and dirty bitfiles that need a bit more work before freeing
Source location: msfs/bs/bs_access.c
The following example uses the bfAccess structure to get extent information about an open file. The address of the bfAccess structure is found in the bfNode structure.
Example 3-12: Accessing Extent Data Through bfAccess
(dbx) p *(*(struct bfAccess *)$bfaccess).xtnts.xtntMap.subXtntMap[0].bsXA[0]
struct {
    bsPage = 0
    vdBlk = 222416
}
(dbx) q
#
#
# pwd
/usr/bruden
# showfile -x ob_1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
5c98.8002 1 16 3 simple ** ** async 100% ob_1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
0 3 1 222416 48
extentCnt: 1
#
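The Id column printed by showfile is the file's tag in hexadecimal, which is how the dbx values (num = 23704, seq = 32770) and the showfile output (5c98.8002) can be matched. A minimal sketch of the conversion follows; the format string is inferred from the showfile listings in this guide rather than from the showfile source, the function name is hypothetical, and a 32-bit unsigned int is assumed.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format a tag's num and seq the way the showfile
 * Id column appears in this guide (hex num, a dot, four-digit hex seq).
 * Returns the value returned by snprintf.
 */
int tag_to_id(unsigned int num, unsigned int seq, char *buf, size_t len)
{
    return snprintf(buf, len, "%x.%04x", num, seq);
}
```

Reserved bitfiles have negative tags, so a tag of -11 converted to unsigned formats as fffffff5.0000, the Id that showfile reports for a miscellaneous bitfile.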
bfSet Structure
The bfSet structure is the lower-level BAS structure for a fileset. It includes:
• Pointer to the fileset’s domain
• Tag and references to this set’s tag directory
• Cloned/master state
• Fragment file information
• Back pointer to fileset node
Source location: msfs/msfs/bs_bitfile_sets.h
Finding bfSet Structures
The elements of BfSetHashTbl indirectly point to bfSet structures.
The following macro generates a hash key value for accessing the BfSetHashTbl.
Example 3-13: Hash Key for BfSetHashTbl
#define BFSET_GET_HASH_KEY( _bfSetId ) ( (_bfSetId).domainId.tv_sec + (_bfSetId).dirTag.num )
Source location: msfs/bs/bs_bitfile_sets.h
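To see what the macro produces, it can be applied to the bfSetId values displayed earlier in this chapter. The sketch below reuses the macro text as shown; the structure definitions are simplified stand-ins with assumed field types, and bfset_hash_key is a hypothetical wrapper.

```c
/* Simplified stand-ins for bfSetIdT and its members; field types are assumed. */
struct tag_sketch { unsigned int num; unsigned int seq; };
struct domain_id_sketch { long tv_sec; long tv_usec; };
struct bfset_id_sketch {
    struct domain_id_sketch domainId;
    struct tag_sketch dirTag;
};

/* Macro text as shown in Example 3-13. */
#define BFSET_GET_HASH_KEY( _bfSetId ) \
    ( (_bfSetId).domainId.tv_sec + (_bfSetId).dirTag.num )

/* Hypothetical wrapper so the key can be used as an ordinary value. */
unsigned long bfset_hash_key(struct bfset_id_sketch id)
{
    return (unsigned long)BFSET_GET_HASH_KEY(id);
}
```

For the fileset shown earlier (tv_sec = 937059922, dirTag.num = 1) the key is 937059923. How the key is reduced to a BfSetHashTbl index is not shown in the excerpt, so that step is left out.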
domain Structure
The domain structure includes domain-specific information:
• Root tag directory references
• Log location and state information
• Overall buffering state information
Source location: msfs/msfs/bs_domain.h
There is another structure domain in the networking side of UNIX. Reference the AdvFS structure with the symbol name domainT.
Finding domain Structures
Source location: msfs/bs/bs_domain.c
The following example displays the fields in the domain structure. The address of the domain structure is found in the bfAccess structure which is pointed to by the bfNode extension of the vnode.
Example 3-14: Displaying the domain Structure
(dbx) p *(domainT *)$domain
struct {
    mutex = struct {
        mutex = 0
    }
    dmnMagic = 2918187011
    dmnFwd = 0xfffffc0000f24008
    dmnBwd = 0xfffffc0000f24008
    dmnHashlinks = struct {
        dh_links = struct {
            dh_next = 0xfffffc0000f24008
            dh_prev = 0xfffffc0000f24008
        }
        dh_key = 937059922
    }
    dmnVersion = 4
    state = BFD_ACTIVATED
    domainId = struct {
        tv_sec = 937059922
        tv_usec = 653520
    }
    dualMountId = struct {
        tv_sec = 0
        tv_usec = 0
    }
    bfDmnMntId = struct {
        tv_sec = 937392621
        tv_usec = 874423
    }
    dmnAccCnt = 4
    dmnRefWaiters = 0
    activateCnt = 2
    mountCnt = 2
    bfSetDirp = 0xfffffc0005b7c788
    bfSetDirTag = struct {
        num = 4294967288
        seq = 0
    }
(...)
    bfSetHead = struct {
        bfsQfwd = 0xfffffc0005b7cce8
        bfsQbck = 0xfffffc0005b7c7e8
    }
    bfSetDirAccp = 0xfffffc0005af8488
    ftxLogTag = struct {
        num = 4294967287
        seq = 0
    }
    ftxLogP = 0xfffffc0005baec48
    ftxLogPgs = 512
    logAccessp = 0xfffffc0005af8908
(...)
    domainName = "usr_domain"
    majorNum = 2055
    flag = BFD_NORMAL
    lsnLock = struct {
        mutex = 0
    }
    lsnList = struct {
        lsnFwd = 0xfffffe04075c0e68
        lsnBwd = 0xfffffe04075c0e68
(...)
    vdCnt = 1
    vdpTbl = {
        [0] 0xfffffc0000f2b508
        [1] (nil)
        [2] (nil)
(...)
    bcStat = struct {
        pinHit = 14402
        pinHitWait = 784
        pinRead = 0
        refHit = 24821
        refHitWait = 69
        raBuf = 2458
        ubcHit = 1418
        unpinCnt = struct {
            lazy = 14162
            blocking = 66
            clean = 11
            log = 2172
        }
        derefCnt = 28516
        devRead = 3025
        devWrite = 3229
(...)
    bmtStat = struct {
        fStatRead = 0
        fStatWrite = 5316
        resv1 = 0
        resv2 = 0
        bmtRecRead = {
            [0] 0
            [1] 0
(...)
            [21] 0
        }
        bmtRecWrite = {
            [0] 0
            [1] 0
            [2] 97
            [3] 0
(...)
            [21] 0
        }
    }
    logStat = struct {
        logWrites = 313
        transactions = 9860
        segmentedRecs = 3
        logTrims = 0
        wastedWords = 27558
        maxLogPgs = 102
        minLogPgs = 0
        maxFtxWords = 2127
        maxFtxAgent = 91
        maxFtxTblSlots = 15
        oldFtxTblAgent = 34
        excSlotWaits = 0
        fullSlotWaits = 0
        rsv1 = 0
        rsv2 = 0
        rsv3 = 0
        rsv4 = 0
    }
    totalBlks = 1426112
    freeBlks = 324480
    dmn_panic = 0
(...)
    smsync_policy = 0
    metaPagep = 0xfffffe0400299008
    fs_full_time = 0
}
vd Structure
The vd structure is the per-virtual disk structure. It includes:
• Pointer to device vnode
• Pointer to RBMT, BMT, and SBM bitfiles
• Physical characteristics of device
• I/O queuing information
The following is an excerpt from bs_vd.h showing descriptions of some of the fields in the data structure.
Example 3-15: Fields in the vd Structure
/*
 * vd - this structure describes a virtual disk, including accessed
 * bitfile references, its size, i/o queues, name, id, and an
 * access handle for the metadata bitfile.
 */
typedef struct vd {
    /*
    ** Static fields (ie - they are set once and never changed).
    */
    uint32T stgCluster;          /* num blks each stg bitmap bit */
    struct vnode *devVp;         /* device access (temp vnode *) */
    uint_t vdMagic;              /* magic number: structure validation */
    bfAccessT *rbmtp;            /* access structure pointer for RBMT */
    bfAccessT *bmtp;             /* access structure pointer for BMT */
    bfAccessT *sbmp;             /* access structure pointer for SBM */
    domainT *dmnP;               /* domain pointer for ds */
    uint32T vdIndex;             /* 1-based virtual disk index */
    uint32T maxPgSz;             /* max possible page size on vd */
    uint32T bmtXtntPgs;          /* number of pages per BMT extent */
    char vdName[BS_VD_NAME_SZ];  /* temp - should be global name */

    /* The following fields are protected by the vdT.vdStateLock mutex */
    bsVdStatesT vdState;         /* vd state */
    struct thread *vdSetupThd;   /* Thread Id of the thread setting up vdT */
    uint32T vdRefCnt;            /* # threads actively using this volume */
    uint32T vdRefWaiters;        /* # threads waiting for vdRefCnt to goto 0 */
    mutexT vdStateLock;          /* lock for above 4 fields */

    /*
     * The following fields are protected by the vdScLock semaphore
     * in the domain structure. This lock is protected by the
     * domain mutex. Use the macros VD_SC_LOCK and VD_SC_UNLOCK.
     */
    uint32T vdSize;              /* count of vdSectorSize blocks in vd */
    int vdSectorSize;            /* Sector size, in bytes, normally 512 */
    uint32T vdClusters;          /* num clusters in vd */
    serviceClassT serviceClass;  /* service class provided */

    ftxLkT mcell_lk;             /* used with domain mutex */
    int nextMcellPg;             /* next available metadata cell’s page num */
    ftxLkT rbmt_mcell_lk;        /* This lock protects mcell allocation from
                                  * the rbmt mcell pool. This pool is used
                                  * to extend reserved bitfiles. */
    int lastRbmtPg;              /* last available reserved mcell’s page num */
    int rbmtFlags;               /* protected by rbmt_mcell_lk */

    ftxLkT stgMap_lk;            /* used with domain mutex */
    stgDescT *freeStgLst;        /* ptr to list of free storage descriptors */
    uint32T numFreeDesc;         /* number of free storage descriptors in list */
    uint32T freeClust;           /* total number free clusters */
    uint32T scanStartClust;      /* cluster where next bitmap scan will start */
    uint32T bitMapPgs;           /* number of pages in bitmap */
    uint32T spaceReturned;       /* space has been returned */
    stgDescT *fill1;             /* ptr to list of reserved storage descriptors */
    stgDescT *fill3;             /* ptr to list of free, reserved stg descriptors */
    uint32T fill4;               /* # of free, reserved stg descriptors in list */

    ftxLkT del_list_lk;          /* protects global defered delete list */

    lock_data_t ddlActiveLk;     /* Synchs processing of deferred-delete list entries */
                                 /* used with domain mutex */
    bfMCIdT ddlActiveWaitMCId;   /* If non-nil, a thread is waiting on this entry */
                                 /* Use domain mutex for synchronization */
    cvT ddlActiveWaitCv;         /* Used when waiting for active ddl entry */

    struct dStat dStat;          /* collect device statistics */

    /*
     * I/O queues; these fields protected by vdIoLock
     */
    mutexT vdIoLock;             /* simple lock for guarding I/O fields. */
    ioDescHdrT blockingQ;        /* For blocking I/O */
    ioDescHdrT waitLazyQ;        /* Transactional buffers w/ too high lsn */
    ioDescHdrT smSyncQ[SMSYNC_NQS]; /* smooth sync queues */
    ioDescHdrT readyLazyQ;       /* Sorted, ready for consolidation */
    ioDescHdrT consolQ;          /* Consolidated, ready to be written */
    ioDescHdrT devQ;             /* Tracks device */
    int blockingCnt;             /* keep track of how many times we can take */
#define BLOCKFACT 4
    int blockingFact;            /* from blocking q before taking from consol q */
    int rdmaxio;                 /* max blocks that can be read/written */
    int wrmaxio;                 /* in a consolidated I/O */
    int vdIoOut;                 /* There are outstanding I/O’s on this vd */
    int start_active;            /* Recursion preventer */
    int gen_active;              /* I/O generation loop active */
    stateLkT active;             /* indicates when disk (or lazy thread) is busy */
    short advfs_start_more_posted; /* 0 = no message yet posted */
                                   /* 1 = message posted but not processed */
                                   /* 2 = vd_free in progress */

#ifdef ADVFS_DEBUG
    enum deFlags errorFlag;
    int errorCount;
    int errorRepeat;
#endif /* ADVFS_DEBUG */

    u_long blkQ_cnt;             /* count of bufs placed onto blockingQ */
    u_long lazyQ_cnt;            /* count of bufs placed onto lazyQ */
    u_long smsyncQ_cnt;          /* count of bufs placed onto smsyncQ */
    u_long readyQ_cnt;           /* count of bufs placed onto readyQ */
    u_long consolQ_cnt;          /* count of bufs placed onto consQ */
    u_long devQ_cnt;             /* count of bufs placed onto devQ */
    u_long rmioq_cnt;            /* count of bufs rm_ioq’ed */
    u_long rmormvq_cnt;          /* count of bufs rm_or_moveq’ed */
    u_int syncQIndx;             /* next smsync queue to be processed */
    /* end of fields protected by vdIoLock */

    int consolidate;             /* Flag, one indicates disk can take big io’s */
    int max_iosize_rd;           /* From device */
    int max_iosize_wr;           /* From device */
    int preferred_iosize_rd;     /* From device */
    int preferred_iosize_wr;     /* From device */
    int qtodev;                  /* max number of I/O’s to be queued to dev */

    stgDescT freeRsvdStg;        /* desc for free rsvd stg for rsvd files */
#ifdef ADVFS_VD_TRACE
    uint32T trace_ptr;
    vdTraceElmtT trace_buf[VD_TRACE_HISTORY];
#endif
} vdT;
Defining Other In-Memory Structures
Free Space Cache
The free space cache is a per-volume, in-core structure (structure stgDesc) with these characteristics:
• Linked list of contiguous free clusters
• Each entry gives the starting block and size of each free area
To avoid costly I/Os and bitmap scanning when searching for free space on a volume, AdvFS uses an in-memory free space cache to keep track of free space on a volume. The cache is a linked list of free space extents. It is a cache because it has a limited number of entries so it does not represent all the free space on a volume. The entries in the cache are sorted by cluster number.
The free space cache is filled whenever it becomes empty (or when it is explicitly invalidated). It is filled by scanning the bitmap and creating cache entries for free space extents in the bitmap.
The following example describes the storage descriptor structure found in msfs/msfs/bs_vd.h. This structure uses data drawn from the SBM.
Example 3-16: Fields in the stgDesc Structure
/*
 * stgDescT - Describes a contiguous set of available (free) vd blocks.
 * These structures are used to maintain a list of free disk space. There
 * is a free list in each vd structure. The list is ordered by virtual
 * disk block (it could also be ordered by the size of each contiguous
 * set of blocks in the future). Refer to the "sbm_" routines in
 * bs_sbm.c.
 */
typedef struct stgDesc {
    uint32T start_clust;     /* vd cluster number of first free cluster */
    uint32T num_clust;       /* number of free clusters */
    struct stgDesc *prevp;
    struct stgDesc *nextp;
} stgDescT;
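The fill-and-search behavior described above can be illustrated with a small user-space model. This is a sketch, not the sbm_ routines from bs_sbm.c: it assumes a set bit means the cluster is in use, numbers bits from the low end of each 32-bit word, caps the cache at a caller-supplied number of entries, and uses a simple first-fit search. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for stgDescT with the same four fields. */
struct stg_desc_sketch {
    uint32_t start_clust;  /* first free cluster of the extent */
    uint32_t num_clust;    /* number of free clusters in the extent */
    struct stg_desc_sketch *prevp;
    struct stg_desc_sketch *nextp;
};

/* Assumption: a set bit marks a cluster that is in use. */
static int clust_in_use(const uint32_t *map, uint32_t c)
{
    return (int)((map[c / 32] >> (c % 32)) & 1u);
}

/* Scan the bitmap and link up to max_desc free extents, in cluster order. */
struct stg_desc_sketch *fill_free_cache(const uint32_t *map, uint32_t nclust,
                                        struct stg_desc_sketch *pool, size_t max_desc)
{
    struct stg_desc_sketch *head = NULL, *tail = NULL;
    size_t used = 0;
    uint32_t c = 0;

    while (c < nclust && used < max_desc) {
        while (c < nclust && clust_in_use(map, c))   /* skip allocated clusters */
            c++;
        if (c == nclust)
            break;
        uint32_t start = c;
        while (c < nclust && !clust_in_use(map, c))  /* measure the free run */
            c++;

        struct stg_desc_sketch *d = &pool[used++];
        d->start_clust = start;
        d->num_clust = c - start;
        d->prevp = tail;
        d->nextp = NULL;
        if (tail != NULL)
            tail->nextp = d;
        else
            head = d;
        tail = d;
    }
    return head;
}

/* First-fit search: first cached extent with at least want clusters, or NULL. */
const struct stg_desc_sketch *find_free(const struct stg_desc_sketch *head, uint32_t want)
{
    for (; head != NULL; head = head->nextp)
        if (head->num_clust >= want)
            return head;
    return NULL;
}
```

Because the number of entries is capped, a failed search does not prove the volume is full; it only means no cached extent is large enough, which is why the text describes the list as a cache.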
Bitfile Buffer Descriptor
Characteristics of the bitfile buffer descriptor include:
• Descriptor for bitfile pages
• Structure bsBuf
— Lots of fields for doubly linked lists
— Log record addresses
— Page address
Domain, bitfile-set, fileset, page
— Physical location
— bfAccess structure
— I/O descriptor information and queues
Migrating pages may have more than one I/O descriptor.
This structure describes the in-core state of an AdvFS page stored in the primary cache. Pinned pages may be found here.
Source location: msfs/msfs/bs_buf.h
I/O Descriptor
Characteristics of the I/O descriptor include:
• Links for the I/O queues
• Block descriptor
— Virtual disk
— Block
• Address of buffer
• Pointer to bsBuf structure
The source for the structure ioDesc is found in msfs/msfs/bs_ims.h.
FTX State Structure
Characteristics of the FTX state structure include:
• Structure ftx or ftxStateT
• Fields include log record numbers:
— First and last written
— Undo back link
Source location: msfs/msfs/ftx_privates.h
Summary
Examining AdvFS In-Memory Structures
Recall that all I/O flows through the VFS software, which directs the flow to the appropriate file system-specific software. The following file system layers are supported by in-memory structures.
VFS layer      Vnode and mount structures
FAS layer      POSIX file and fileset structures
BAS layer      Bitfile and bitfile-set structures
Checking the VFS Layer
The virtual file system acts as a director of subsequent file system activities. This software determines which type of file system the I/O is destined for and directs the logic flow accordingly.
• VFS specific structures
• The vnode structure
• The mount structure
Explaining the FAS Layer
The FAS layer in-memory structures include those shown here.
bfNode         Bitfile node pointer
fsContext      Fileset context (points to parent fileset)
fileSetNode    Fileset node (AdvFS private mount information)
dQuot          Disk quota cache
Locating the BAS Layer
The BAS layer in-memory structures include those shown here.
bfAccess       Provides access to bitfiles
bfSet          Bitfile-set structure
domainT        File domain
vd             Virtual disks
Defining Other In-Memory Structures
To avoid costly I/Os and bitmap scanning when searching for free space on a volume, AdvFS uses an in-memory free space cache to keep track of free space on a volume. The cache is a linked list of free space extents.
The bitfile buffer descriptor gives information about the in-core information of an AdvFS page stored in the primary cache. Pinned pages may be found here.
The I/O descriptor contains links for the I/O queues.
FTX state structure ftx describes transactions for logging.
Exercises
Start a process that opens an AdvFS file but does not close it. If you are a C or Korn shell user, start a cat process and press ^Z to suspend it (or use the more command).
% cd some-advfs-directory
% cat > testy.file
Here is one line.
^Z
Note the inode number of the file you have created and the ID of the process writing the file.
1. Determine the address of the file structure associated with the file. Use the ofile command of kdbx or use the dbx techniques shown in class. If you followed the suggestion above when creating the file, the address of the file structure will be found in file descriptor 1, standard output.
# kdbx -k /vmunix
....
(kdbx) ofile -pid 19279
Proc=0xfffffc0002590ca0 pid=19279
ofile[ 0]=0xfffffc0003269680
ofile[ 1]=0xfffffc0003269400
ofile[ 2]=0xfffffc0003269680
2. Now print the file structure. Examine its values to make sure they seem reasonable. You may continue to use kdbx; however, if you encounter problems, try dbx, which does not crash as frequently.
# dbx -k /vmunix
.......
(dbx) set $f = (struct file *)0xfffffc0003269400
(dbx) print *(struct file *)$f
3. The f_data field of the file structure points to the file’s vnode. Save and print the address of the vnode. It’s useful to print the address so you’ll have it handy in case dbx crashes. Now print the vnode structure.
(dbx) set $v=(struct vnode *)(((struct file *)$f)->f_data)
(dbx) p $v
0xfffffc0002964e00
(dbx) print *(struct vnode *)$v
You should not have to type the (struct vnode *) type cast.
Feel free to use a supplied alias, or create your own.
4. The bfNode and fileset context structures are in the private area of the vnode. Set a pointer to the bfNode and print its contents. The bfNode is an extension to the vnode and contains a pointer to the fsContext structure.
(dbx) set $bf=(struct bfNode *)&(((struct vnode *)$v)->v_data)
(dbx) p $bf
0xfffffc0002964eb8
(dbx) p *(struct bfNode *)$bf
Note the access and fileset context pointer for your file.
5. Verify that you are looking at the right file by matching the tag number, as shown by showfile, with the tag you see with dbx.
6. Obtain a pointer to the fileset context and print it.
(dbx) set $fc=(struct fsContext *)(($bf)->fsContextp)
(dbx) p $fc
0xfffffc0002964ee0
(dbx) p *(struct fsContext *)$fc
struct {
    initialized = 1
.......
    dir_stats = struct {
        st_ino = struct {
            num = 13104
            seq = 32790
        }
.......
    fileSetNode = 0xfffffc0003f26288
    diskQuot = {
        [0] 0xfffffc00014bc988
        [1] 0xfffffc00014bca08
    }
}
Note how information contained in several BMT records related to the file has been placed into one in-memory structure. Verify that the POSIX file statistics information seems reasonable.
7. Print out the two disk quota structures.
(dbx) set $qu=(struct dQuot *)(($fc)->diskQuot[0])
(dbx) p *(struct dQuot *)$qu
.......
(dbx) set $qg=(struct dQuot *)(($fc)->diskQuot[1])
(dbx) p *(struct dQuot *)$qg
At this point you have looked at the in-core, FAS-level structures for the file. Now look at the FAS-level structure for the file system or fileset. You could go directly from the fileset context structure, but let’s take the scenic tour through the mount structure.
8. There is a pointer to the mount structure inside the vnode. Print the structure. Wake up when you see the few MSFS-specific fields.
(dbx) set $m=(struct mount *)(((struct vnode *)$v)->v_mount)
(dbx) print $m
(dbx) print *(struct mount *)$m
9. The m_info field of the mount structure contains a pointer to AdvFS private file system information. This information is the fileSetNode. Print it.
(dbx) set $fsn=(struct fileSetNode *)(((struct mount *)$m)->m_info)
(dbx) px $fsn
0xfffffc0003f26288
(dbx) p *(struct fileSetNode *)$fsn
struct {
........
    domainId = struct {
        tv_sec = 865089685
        tv_usec = 897520
    }
.......
    bfSetH = struct {
        setH = 4
        dmnH = 2
    }
    root_vp = 0xfffffc0001451200
.......
    fileSetStats = struct {
        msfs_lookup = 8721216
.......
You will see all sorts of interesting information: domain ID, bitfile-set handle, pointer to the file system’s root directory, and lots of statistics.
10. Use showfdmn to verify you have a domain ID match.
The significant FAS-level structures have now been studied.
11. Print the access structure and see if its tag number matches your target.
(dbx) p *(struct bfAccess *)$bfa
struct {
    fwd = 0xffffffff8077a3a8
    bwd = 0xfffffc0000601c90
.......
    tag = struct {
        num = 12572
        seq = 32797
    }
.......
12. Examine the back pointers to the vnode and VM object. Use these fields for additional confirmation that you have the right target. You will also see pointers to extent map information and the bitfile-set and domain structures.
(dbx) p *(struct bfAccess *)$bfa
13. Print the extent map.
For most bitfiles this will not be a very exciting structure; however, for bitfiles with many extents, this is the beginning of a mass of pointers. Note that the subXtntMap field is an array with validCnt elements. Each subXtntMap structure has an array of extents (bsXA) with cnt elements.
We can now proceed to the bitfile-set, domain, and virtual disk data structures of AdvFS. Use the pointers of the bitfile access structure to find them.
14. Move from the bitfile access structure into the bitfile-set structure. Print it. There is a lot to see. Be sure to use the bitfile-set ID field to verify and match the values returned by showfsets.
(dbx) set $bfs=(bfSetT *)(((struct bfAccess *)$bfa)->bfSetp)
(dbx) p $bfs
(dbx) p *(bfSetT *)$bfs
struct {
    bfSetId = struct {
        domainId = struct {
            tv_sec = 864927707
            tv_usec = 860832
        }
        dirTag = struct {
            num = 1
            seq = 32769
        }
    }
.......
    dmnp = 0xfffffc000123e008
15. Now print the domain structure. (Do not use struct domain unless you want the structure for socket domains.) In the middle of this structure, you will see an array for pointers to virtual disk structures. There are also many fields used to control file domain I/O.
(dbx) set $d=(domainT *)(((bfSetT *)$bfs)->dmnp)
(dbx) p $d
0xfffffc000123e008
(dbx) p *(domainT *)$d
struct {
.......
    domainName = "usr_domain"
.......
    vdpTbl = {
        [0] 0xfffffc0003a18388
.......
16. The last major structure to print is used for virtual disks. You will see even more I/O control substructures here.
(dbx) set $vd=(struct vd *)(((domainT *)$d)->vdpTbl[0])
(dbx) p $vd
0xfffffc0003a18388
(dbx) p *(struct vd *)$vd
struct {
.......
    vdName = "/dev/disk/dsk2g"
........
    freeStgLst = 0xfffffc0003fcf188
17. For the finale, print a data structure of the free storage cache.
(dbx) set $fst=(stgDescT *)(((struct vd *)$vd)->freeStgLst)
(dbx) p $fst
0xfffffc0003fcf188
(dbx) p *(stgDescT *)$fst
struct {
    start_clust = 705832
    num_clust = 14360
    prevp = 0xfffffc0003fcf188
    nextp = 0xfffffc0003fcf188
}
Solutions
Start a process that opens an AdvFS file but does not close it. If you are a C or Korn shell user, start a cat process and press ^Z to suspend it (or just use more).
% cd some-advfs-directory
% cat > testy.file
Here is one line.
^Z
Note the inode number of the file you have created and the ID of the process writing the file.
#
# cat openone.c
/* openone.c I */
/* SAMPLE PROGRAM FOR FILE OPEN TESTING */
/* Opens or creates a file named ob_1 in the current directory. */
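#include <sys/file.h>
#include <fcntl.h>      /* open(), O_RDWR, O_CREAT */
#include <stdio.h>      /* printf(), perror(), getchar() */
#include <stdlib.h>     /* exit(), EXIT_FAILURE */
#include <unistd.h>     /* getuid(), getpid(), close() */
int main(void)
{
    int fd, uid, pid;

    /* Open (or create) the file and keep it open until a key is pressed. */
    fd = open("ob_1", O_RDWR | O_CREAT, 0777);
    if (fd == -1)
    {
        perror("open failed ");
        exit(EXIT_FAILURE);
    }
    printf("file opened -- file descriptor is %d.\n",fd);
    uid = getuid();
    printf("uid is %d.\n",uid);
    pid = getpid();
    printf("pid is %d.\n",pid);
    printf("Hit any key to close file and terminate program.\n");
    getchar();
    close(fd);
    printf("done\n");
    return 0;
}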
#
# cc -o openone openone.c
#
# ./openone&
[1] 816
# file opened -- file descriptor is 3.
uid is 0.
pid is 816.
Hit any key to close file and terminate program.
#
[1] + Stopped(SIGTTIN) ./openone&
#
# ps | grep open
816 pts/5 T N 0:00.02 ./openone
818 pts/5 U + 0:00.01 grep open
#
#
# ls -li ob_1
23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1
1. Determine the address of the file structure associated with the file. Use the ofile command of kdbx or use the dbx techniques shown in class. If you followed the suggestion above when creating the file, the address of the file structure will be found in file descriptor 1, standard output.
# kdbx -k /vmunix
....
(kdbx) ofile -pid 19279
Proc=0xfffffc0002590ca0 pid=19279
ofile[ 0]=0xfffffc0003269680
ofile[ 1]=0xfffffc0003269400
ofile[ 2]=0xfffffc0003269680
# kdbx -k /vmunix
dbx version 5.0
Type ’help’ for help.
stopped at [thread_block:2709 ,0xfffffc00002c8084] Source not available
warning: Files compiled -g3: parameter values probably wrong
(kdbx)
(kdbx)
(kdbx)
(kdbx) ofile -pid 816
Proc=0xfffffc00059b2c80 pid= 816
ofile[ 0]=0xfffffc000248e040
ofile[ 1]=0xfffffc000248e040
ofile[ 2]=0xfffffc000248e040
ofile[ 3]=0xfffffc0002eef500
(kdbx)
(kdbx)
(kdbx) q
dbx (pid 832) died. Exiting...
#
Through dbx
#
# dbx -k /vmunix
dbx version 5.0
Type ’help’ for help.
stopped at [thread_block:2709 ,0xfffffc00002c8084] Source not available
warning: Files compiled -g3: parameter values probably wrong
(dbx)
(dbx)
(dbx)
(dbx) set $pid=816
(dbx)
(dbx) p (*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][3].ufe_ofile
0xfffffc0002eef500
2. Now print the file structure. Examine its values to make sure they seem reasonable. You can continue to use kdbx; however, if it starts giving you trouble, remember that dbx does not crash as frequently.
# dbx -k /vmunix
.......
(dbx) set $f = (struct file *)0xfffffc0003269400
(dbx) print *(struct file *)$f
(dbx) set $f = 0xfffffc0002eef500
(dbx)
(dbx) px $f
0xfffffc0002eef500
(dbx)
(dbx)
(dbx) p *(struct file *)$f
struct {
f_incore_lock = 0
f_flag = 3
f_count = 1
f_type = 1
f_msgcount = 0
f_cred = 0xfffffc000346c3c0
f_ops = 0xfffffc00006c5990
f_data = 0xfffffc00020ec000 = ""
f_u = union {
fu_offset = 0
fu_freef = (nil)
}
f_io_lock = 0
f_io_waiters = 0
}
3. The f_data field of the file structure points to the file’s vnode. Save and print the address of the vnode. It is useful to print the address so you will have it handy in case dbx crashes. Now print the vnode structure.
(dbx) set $v=(struct vnode *)(((struct file *)$f)->f_data)
(dbx) p $v
0xfffffc0002964e00
(dbx) print *(struct vnode *)$v
You should not have to type the (struct vnode *) type cast. Feel free to use a supplied macro or create your own.
(dbx) set $v = 0xfffffc00020ec000
(dbx)
(dbx) p *(struct vnode *)$v
struct {
v_lock = 0
v_flag = 0
v_usecount = 1
v_aux_lockers = 0
v_shlockc = 0
v_exlockc = 0
v_lastr = 0
v_id = 29695
v_type = VREG
v_tag = VT_MSFS
v_mount = 0xfffffc0005ab2a80
v_mountedhere = (nil)
v_op = 0xfffffc00006b01d0
v_freef = (nil)
v_freeb = (nil)
v_mountf = 0xfffffc0002218900
v_mountb = 0xfffffc0004ad4058
v_buflists_lock = 0
v_cleanblkhd = (nil)
v_dirtyblkhd = (nil)
v_ncache_time = 1765
v_free_time = 1286
v_output_lock = 0
v_numoutput = 0
v_outflag = 0
v_cache_lookup_refs = 0
v_rdcnt = 1
v_wrcnt = 1
v_dirtyblkcnt = 0
v_dirtyblkpush = 0
v_un = union {
vu_socket = (nil)
vu_specinfo = (nil)
vu_fifonode = (nil)
}
v_object = 0xfffffc00027f9380
v_secattr = (nil)
v_data = "
}
(dbx)
(dbx)
(dbx) alias v5_get_ofile_vnode_struct(pidd,fd) "set $pid=pidd;p *(struct vnode *)(*(struct super_task *)thread.task).utask.uu_file_state.uf_entry[0][fd].ufe_ofile.f_data"
(dbx)
(dbx)
(dbx) v5_get_ofile_vnode_struct(816,3)
struct {
v_lock = 0
v_flag = 0
v_usecount = 1
v_aux_lockers = 0
v_shlockc = 0
v_exlockc = 0
v_lastr = 0
v_id = 29695
v_type = VREG
v_tag = VT_MSFS
v_mount = 0xfffffc0005ab2a80
v_mountedhere = (nil)
v_op = 0xfffffc00006b01d0
v_freef = (nil)
v_freeb = (nil)
v_mountf = 0xfffffc0002218900
v_mountb = 0xfffffc0004ad4058
v_buflists_lock = 0
v_cleanblkhd = (nil)
v_dirtyblkhd = (nil)
v_ncache_time = 1765
v_free_time = 1286
v_output_lock = 0
v_numoutput = 0
v_outflag = 0
v_cache_lookup_refs = 0
v_rdcnt = 1
v_wrcnt = 1
v_dirtyblkcnt = 0
v_dirtyblkpush = 0
v_un = union {
vu_socket = (nil)
vu_specinfo = (nil)
vu_fifonode = (nil)
}
v_object = 0xfffffc00027f9380
v_secattr = (nil)
v_data = "
}
4. The bfNode and fileset context structures are in the private area of the vnode. Set a pointer to the bfNode and print its contents. The bfNode is an extension to the vnode and contains a pointer to the fsContext structure.
(dbx) set $bf=(struct bfNode *)&(((struct vnode *)$v)->v_data)
(dbx) p $bf
0xfffffc0002964eb8
(dbx) p *(struct bfNode *)$bf
Note the access and fileset context pointer for your file.
(dbx) px $v
0xfffffc00020ec000
(dbx)
(dbx)
(dbx) p &(*(struct vnode *)$v).v_data
0xfffffc00020ec0c8 = "^H^;\256^E"
(dbx)
(dbx)
(dbx) set $bf=0xfffffc00020ec0c8
(dbx)
(dbx)
(dbx) p *(struct bfNode *)$bf
struct {
accessp = 0xfffffc0005aefb08
fsContextp = 0xfffffc00020ec0f0
tag = struct {
num = 23723
seq = 32771
}
bfSetId = struct {
domainId = struct {
tv_sec = 937059922
tv_usec = 653520
}
dirTag = struct {
num = 1
seq = 32769
}
}
}
(dbx)
(dbx)
(dbx) whatis struct bfNode
struct bfNode {
struct bfAccess * accessp;
struct fsContext * fsContextp;
bfTagT tag;
bfSetIdT bfSetId;
};
(dbx)
(dbx)
(dbx) set $fsc = 0xfffffc00020ec0f0
(dbx)
(dbx)
(dbx) p *(struct fsContext *)$fsc
struct {
initialized = 1
quotaInitialized = 1
undel_dir_tag = struct {
num = 0
seq = 0
}
fs_flag = 0
dirty_stats = 0
dirty_alloc = 0
file_lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc00020ec118
prev = 0xfffffc00020ec118
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
dirstamp = 0
fsContext_mutex = struct {
mutex = 0
}
bf_tag = struct {
num = 23723
seq = 32771
}
last_offset = 0
dir_stats = struct {
st_ino = struct {
num = 23723
seq = 32771
}
st_mode = 33261
st_uid = 0
st_gid = 0
st_rdev = 0
st_size = 0
st_atime = 938696472
st_uatime = 976962000
st_mtime = 938696472
st_umtime = 976962000
st_ctime = 938696472
st_uctime = 976962000
st_flags = 0
dir_tag = struct {
num = 23707
seq = 32769
}
fragId = struct {
frag = 0
type = BF_FRAG_ANY
}
st_nlink = 1
st_unused_1 = 0
fragPageOffset = 0
st_unused_2 = 0
}
fileSetNode = 0xfffffc0005ab5088
diskQuot = {
[0] 0xfffffc0005ac7088
[1] 0xfffffc0005ac7148
}
}
5. Verify that you are looking at the right file by matching the tag number, as shown by showfile, with the tag you see with dbx.
(dbx) sh showfile -x ob_1
Id Vol PgSz Pages XtntType Segs SegSz I/O Perf File
5cab.8003 1 16 0 simple ** ** async 100% ob_1
extentMap: 1
pageOff pageCnt vol volBlock blockCnt
extentCnt: 0
(dbx)
(dbx)
(dbx) sh ls -li ob_1
23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1
(dbx)
6. Obtain a pointer to the fileset context and print it.
(dbx) set $fc=(struct fsContext *)(((struct bfNode *)$bf)->fsContextp)
(dbx) p $fc
0xfffffc0002964ee0
(dbx) p *(struct fsContext *)$fc
struct {
    initialized = 1
.......
    dir_stats = struct {
        st_ino = struct {
            num = 13104
            seq = 32790
        }
.......
    fileSetNode = 0xfffffc0003f26288
    diskQuot = {
        [0] 0xfffffc00014bc988
        [1] 0xfffffc00014bca08
    }
}
Note how information contained in several BMT records related to the file has been placed into one in-memory structure. Verify that the POSIX file statistics information seems reasonable. (Extracted from the solution for #4.)
(dbx) dir_stats = struct {
st_ino = struct {
num = 23723
seq = 32771
}
st_mode = 33261
st_uid = 0
st_gid = 0
st_rdev = 0
st_size = 0
st_atime = 938696472
st_uatime = 976962000
st_mtime = 938696472
st_umtime = 976962000
st_ctime = 938696472
st_uctime = 976962000
st_flags = 0
dir_tag = struct {
num = 23707
(dbx)
(dbx)
(dbx) po 33261
0100755
(dbx)
(dbx)
(dbx)
(dbx) sh ls -li ob_1
23723 -rwxr-xr-x 1 root system 0 Sep 30 09:01 ob_1
(dbx)
(dbx)
(dbx) sh ls -lid /usr/bruden/advfs
23707 drwxr-xr-x 2 root system 8192 Sep 30 09:01 /usr/bruden/advfs
(dbx)
7. Print out the two disk quota structures.
(dbx) set $qu=(struct dQuot *)(((struct fsContext *)$fc)->diskQuot[0])
(dbx) p *(struct dQuot *)$qu
.......
(dbx) set $qg=(struct dQuot *)(((struct fsContext *)$fc)->diskQuot[1])
(dbx) p *(struct dQuot *)$qg
(dbx) whatis struct fsContext
struct fsContext {
short initialized;
short quotaInitialized;
bfTagT undel_dir_tag;
long fs_flag;
int dirty_stats;
int dirty_alloc;
lock_data_t file_lock;
long dirstamp;
mutexT fsContext_mutex;
bfTagT bf_tag;
long last_offset;
struct fs_stat {
bfTagT st_ino;
mode_t st_mode;
uid_t st_uid;
gid_t st_gid;
dev_t st_rdev;
off_t st_size;
time_t st_atime;
int st_uatime;
time_t st_mtime;
int st_umtime;
time_t st_ctime;
int st_uctime;
uint_t st_flags;
bfTagT dir_tag;
bfFragIdT fragId;
u_short st_nlink;
short st_unused_1;
uint32T fragPageOffset;
uint32T st_unused_2;
} dir_stats;
struct fileSetNode * fileSetNode;
diskQuot[2] of struct dQuot *;
};
(dbx)
(dbx)
(dbx) whatis struct dQuot
struct dQuot {
dyn_hashlinks_w_keyT dq_links;
int dq_flags;
int dq_type;
int dq_cnt;
uint_t dq_id;
union {
struct dQBlk32 {
u_int dqb_bhardlimit;
u_int dqb_bsoftlimit;
u_int dqb_curblocks;
u_int dqb_ihardlimit;
u_int dqb_isoftlimit;
u_int dqb_curinodes;
time_t dqb_btime;
time_t dqb_itime;
} dq_dqb32;
struct dQBlk64 {
u_long dqb_bhardlimit;
u_long dqb_bsoftlimit;
u_long dqb_curblocks;
u_int dqb_ihardlimit;
u_int dqb_isoftlimit;
u_int dqb_curinodes;
u_int dqb_unused1;
time_t dqb_btime;
u_int dqb_unused2;
time_t dqb_itime;
u_int dqb_unused3;
u_long dqb_unused4;
} dq_dqb64;
} dQ;
struct fileSetNode * fileSetNode;
lock_data_t dqLock;
};
(dbx)
(dbx)
(dbx) p *(*(struct fsContext *)$fsc).diskQuot[0]
struct {
dq_links = struct {
dh_links = struct {
dh_next = 0xfffffc0005ac7088
dh_prev = 0xfffffc0005ac7088
}
dh_key = 36028788429215309
}
dq_flags = 8
dq_type = 0
dq_cnt = 203
dq_id = 0
dQ = union {
dq_dqb32 = struct {
dqb_bhardlimit = 0
dqb_bsoftlimit = 0
dqb_curblocks = 0
dqb_ihardlimit = 0
dqb_isoftlimit = 203192
dqb_curinodes = 0
dqb_btime = 0
dqb_itime = 0
}
dq_dqb64 = struct {
dqb_bhardlimit = 0
dqb_bsoftlimit = 0
dqb_curblocks = 203192
dqb_ihardlimit = 0
dqb_isoftlimit = 0
dqb_curinodes = 7477
dqb_unused1 = 0
dqb_btime = 0
dqb_unused2 = 0
dqb_itime = 0
dqb_unused3 = 0
dqb_unused4 = 0
}
}
fileSetNode = 0xfffffc0005ab5088
dqLock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005ac7100
prev = 0xfffffc0005ac7100
}
l_caller = 3682780
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = 0xfffffc0001c93500
}
}
At this point we have seen the in-core, FAS-level structures for the file. Now look at the FAS-level structure for the file system or fileset. You could go directly from the fileset context structure, but take the scenic tour through the mount structure.
8. There is a pointer to the mount structure inside the vnode. Print the structure. Wake up when you see the few MSFS-specific fields.
(dbx) set $m=(struct mount *)(((struct vnode *)$v)->v_mount)
(dbx) print $m
(dbx) print *(struct mount *)$m
(dbx) p (*(struct vnode *)$v).v_mount
0xfffffc0005ab2a80
(dbx)
(dbx)
(dbx) set $m = 0xfffffc0005ab2a80
(dbx)
(dbx) p *(struct mount *)$m
struct {
m_lock = 18446739675758144512
m_flag = 20480
m_funnel = 0
m_nxt = 0xfffffc0005ab2d80
m_prev = 0xfffffc0005ab3380
m_op = 0xfffffc00006af990
m_vnodecovered = 0xfffffc0001987200
m_mounth = 0xfffffc00011c0fc0
m_vlist_lock = 0
m_exroot = 0
m_uid = 0
m_stat = struct {
f_type = 10
f_flags = 20480
f_fsize = 512
f_bsize = 8192
f_blocks = 1426112
f_bfree = 399740
f_bavail = 204992
f_files = 377869
f_ffree = 354165
f_fsid = struct {
val = {
[0] 3776054149
[1] 10
}
}
f_spare = {
[0] 0
[1] 0
[2] 0
}
f_mntonname = 0xfffffc0000d34c90 = "/usr"
f_mntfromname = 0xfffffc0000d34940 = "usr_domain#usr"
mount_info = union {
ufs_args = struct {
fspec = 0x9f8d037da6652
exflags = 1
exroot = 0
}
nfs_args = struct {
addr = 0x9f8d037da6652
fh = 0x1
flags = 0
wsize = 0
rsize = 0
timeo = 0
retrans = 0
maxtimo = 0
hostname = (nil)
acregmin = 0
acregmax = 0
acdirmin = 0
acdirmax = 0
netname = (nil)
pathconf = (nil)
}
mfs_args = struct {
name = 0x9f8d037da6652
base = 0x1
size = 0
}
cdfs_args = struct {
fspec = 0x9f8d037da6652
exflags = 1
exroot = 0
flags = 0
version = 0
default_uid = 0
default_gid = 0
default_fmode = 0
default_dmode = 0
map_uid_ct = 0
map_uid = (nil)
map_gid_ct = 0
map_gid = (nil)
}
procfs_args = struct {
fspec = 0x9f8d037da6652
exflags = 1
exroot = 0
}
msfs_args = struct {
id = struct {
id1 = 937059922
id2 = 653520
tag = 1
}
}
ffm_args = struct {
ffm_flags = 937059922
f_un = union {
ffm_pname = 0x1
ffm_fdesc = 1
}
}
}
}
m_info = 0xfffffc0005ab5088
m_nfs_errmsginfo = struct {
n_noexport = 0
last_noexport = 0
n_stalefh = 0
last_stalefh = 0
}
m_unmount_lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005ab2ba8
prev = 0xfffffc0005ab2ba8
}
l_caller = 4783020
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = 0xfffffc00025ac900
}
}
9. The m_info field of the mount structure contains a pointer to AdvFS private file system information. This information is the fileSetNode. Print it.
(dbx) set $fsn=(struct fileSetNode *)(((struct mount *)$m)->m_info)
(dbx) px $fsn
0xfffffc0003f26288
(dbx) p *(struct fileSetNode *)$fsn
struct {
........
    domainId = struct {
        tv_sec = 865089685
        tv_usec = 897520
    }
.......
    bfSetH = struct {
        setH = 4
        dmnH = 2
    }
    root_vp = 0xfffffc0001451200
.......
    fileSetStats = struct {
        msfs_lookup = 8721216
.......
You will see different types of interesting information: domain ID, bitfile-set handle, pointer to the file system’s root directory, and lots of statistics.
(dbx) p (*(struct mount *)$m).m_info
0xfffffc0005ab5088
(dbx)
(dbx)
(dbx) set $fsn = 0xfffffc0005ab5088
(dbx)
(dbx)
(dbx) p *(struct fileSetNode *)$fsn
struct {
fsNext = (nil)
fsPrev = 0xfffffc0005ab5348
rootTag = struct {
num = 2
seq = 32769
}
tagsTag = struct {
num = 3
seq = 32769
}
filesetMagic = 2918187013
dmnP = 0xfffffc0000f24008
rootAccessp = 0xfffffc0005af7688
bfSetId = struct {
domainId = struct {
tv_sec = 937059922
tv_usec = 653520
}
dirTag = struct {
num = 1
seq = 32769
}
}
bfSetp = 0xfffffc0005b7ca08
root_vp = 0xfffffc0005ac98c0
fsFlags = 0
mountp = 0xfffffc0005ab2a80
quotaStatus = 1421
blkHLimit = 0
blkSLimit = 0
fileHLimit = 0
fileSLimit = 0
blksUsed = 1026404
filesUsed = 23704
blkTLimit = 0
fileTLimit = 0
filesetMutex = struct {
mutex = 0
}
qi = {
[0] struct {
qiAccessp = 0xfffffc0005af7208
qiContext = 0xfffffc0005ac9bf0
qiTag = struct {
num = 4
seq = 32769
}
qiBlkTime = 604800
qiFileTime = 604800
qiFlags = ’^@’
qiPgSz = 8192
qiFilePgs = 2
qiCred = 0xfffffc0005ac6e40
qiLock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005ab5178
prev = 0xfffffc0005ab5178
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
}
[1] struct {
qiAccessp = 0xfffffc0005af6d88
qiContext = 0xfffffc0005ac9e30
qiTag = struct {
num = 5
seq = 32769
}
qiBlkTime = 604800
qiFileTime = 604800
qiFlags = ’^@’
qiPgSz = 8192
qiFilePgs = 1
qiCred = 0xfffffc0005ac6fc0
qiLock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005ab51e0
prev = 0xfffffc0005ab51e0
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
}
}
fileSetStats = struct {
msfs_lookup = 99608
lookup = struct {
hit = 31366
hit_not_found = 1087
miss = 67149
}
msfs_create = 7
msfs_mknod = 0
msfs_open = 0
msfs_close = 5543
msfs_access = 113974
msfs_getattr = 182729
msfs_setattr = 13
msfs_read = 9986
msfs_write = 161
msfs_mmap = 1456
msfs_fsync = 1
msfs_seek = 2981
msfs_remove = 3
msfs_link = 0
msfs_rename = 0
msfs_mkdir = 0
msfs_rmdir = 0
msfs_symlink = 0
msfs_readdir = 4282
msfs_readlink = 2389
msfs_inactive = 86090
msfs_reclaim = 57420
msfs_page_read = 0
msfs_page_write = 0
msfs_getpage = 6706
msfs_putpage = 12
msfs_bread = 0
msfs_brelse = 0
msfs_lockctl = 29
msfs_setvlocks = 8
msfs_syncdata = 0
}
}
10. Use showfdmn to verify you have a domain ID match. The significant FAS-level structures have now been studied.
(dbx) sh pwd
/usr/bruden/advfs
(dbx)
(dbx)
(dbx) sh showfdmn usr_domain
Id Date Created LogPgs Version Domain Name
37da6652.0009f8d0 Sat Sep 11 10:25:22 1999 512 4 usr_domain
Vol 512-Blks Free % Used Cmode Rblks Wblks Vol Name
1L 1426112 204960 86% on 256 256 /dev/disk/dsk1g
11. Print the access structure and see if its tag number matches your target.
(dbx) p *(struct bfAccess *)$bfa
struct {
    fwd = 0xffffffff8077a3a8
    bwd = 0xfffffc0000601c90
.......
    tag = struct {
        num = 12572
        seq = 32797
    }
.......
(dbx) p *(struct bfNode *)$bf
struct {
accessp = 0xfffffc0005aefb08
fsContextp = 0xfffffc00020ec0f0
tag = struct {
num = 23723
seq = 32771
}
bfSetId = struct {
domainId = struct {
tv_sec = 937059922
tv_usec = 653520
}
dirTag = struct {
num = 1
seq = 32769
}
}
}
(dbx)
(dbx)
(dbx) set $bfa = 0xfffffc0005aefb08
(dbx)
(dbx)
(dbx) p *(struct bfAccess *)$bfa
struct {
hashlinks = struct {
dh_links = struct {
dh_next = 0xfffffc0003659208
dh_prev = 0xfffffc000199b688
}
dh_key = 2222450857
}
freeFwd = (nil)
freeBwd = (nil)
onFreeList = 0
accMagic = 2918187009
setFwd = 0xfffffc0003659688
setBwd = 0xfffffc00016a5b08
bfaLock = struct {
mutex = 0
}
accessCnt = 1
refCnt = 1
mmapCnt = 0
stateLk = struct {
hdr = struct {
lkType = LKT_STATE
nxtFtxLk = (nil)
mutex = 0xfffffc0005aefb48
lkUsage = LKU_BF_STATE
}
state = ACC_VALID
pendingState = LKW_NONE
waiters = 0
cv = 0
}
saved_stats = (nil)
bfVp = 0xfffffc00020ec000
bfObj = 0xfffffc00027f9380
bfIoLock = struct {
mutex = 0
}
dkResult = 0
miDkResult = 0
dirtyBufList = struct {
lsnFwd = (nil)
lsnBwd = (nil)
accFwd = 0xfffffc0005aefbb8
accBwd = 0xfffffc0005aefbb8
freeFwd = (nil)
freeBwd = (nil)
hashFwd = (nil)
hashBwd = (nil)
length = 0
touched = 0
ioOut = 0
lenLimit = 0
indexBuf = (nil)
}
cleanBufList = struct {
lsnFwd = (nil)
lsnBwd = (nil)
accFwd = 0xfffffc0005aefc10
accBwd = 0xfffffc0005aefc10
freeFwd = (nil)
freeBwd = (nil)
hashFwd = (nil)
hashBwd = (nil)
length = 0
touched = 0
ioOut = 0
lenLimit = 0
indexBuf = (nil)
}
flushWait = 0
maxFlushWaiters = 0
hiFlushLsn = struct {
num = 0
}
hiWaitLsn = struct {
num = 0
}
nextFlushSeq = struct {
num = 2
}
flushWaiterQ = struct {
head = 0xfffffc0005aefc78
tail = 0xfffffc0005aefc78
cnt = 0
}
msyncWait = 0
msyncNum = 0
raHitPage = 0
raStartPage = 0
logWrite = (nil)
trunc_xfer_lk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefcb8
prev = 0xfffffc0005aefcb8
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
cow_lk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefce8
prev = 0xfffffc0005aefce8
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
nextCloneAccp = (nil)
origAccp = (nil)
cowPgCount = 0
cloneId = 0
cloneCnt = 0
maxClonePgs = 0
dataSafety = BFD_NIL
noClone = 0
deleteWithClone = 0
outOfSyncClone = 0
trunc = 0
cloneAccHRefd = 0
fragState = FS_FRAG_NONE
fragId = struct {
frag = 0
type = BF_FRAG_ANY
}
fragPageOffset = 0
bfPageSz = 16
reqServices = 1
optServices = 0
tag = struct {
num = 23723
seq = 32771
}
bfState = BSRA_VALID
transitionId = 30271
file_size = 0
bfSetp = 0xfffffc0005b7ca08
dmnP = 0xfffffc0000f24008
mcellList_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0005aefb48
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefdb8
prev = 0xfffffc0005aefdb8
}
l_caller = 3296636
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = 0xfffffc0001f83800
}
cv = 0
}
xtntMap_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0005aefb48
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefe10
prev = 0xfffffc0005aefe10
}
l_caller = 3415076
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = 0xfffffc0001f83800
}
cv = 0
}
mapped = 1
nextPage = 0
extendSize = 0
primMCId = struct {
cell = 1
page = 985
}
primVdIndex = 1
xtnts = struct {
validFlag = 1
xtntMap = 0xfffffc0005a8f0e8
shadowXtntMap = (nil)
stripeXtntMap = (nil)
copyXtntMap = (nil)
migTruncLk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefe88
prev = 0xfffffc0005aefe88
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = (nil)
}
type = BSXMT_APPEND
allocPageCnt = 0
}
dirTruncp = (nil)
putpage_lk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005aefec8
prev = 0xfffffc0005aefec8
}
l_caller = 3347236
l_wait_writers = 0
l_readers = 0
l_flags = ’^@’
l_lifms = ’\200’
l_info = 0
l_lastlocker = 0xfffffc0001c92f00
}
real_bfap = (nil)
idx_params = (nil)
idxQuotaBlks = 0
largest_pl_num = 0
actRangeLock = struct {
mutex = 0
}
actRangeList = struct {
arFwd = 0xfffffc0005aeff10
arBwd = 0xfffffc0005aeff10
arCount = 0
arMaxLen = 0
arDioCount = 0
arDioWaiters = 0
}
}
(dbx)
12. Examine the back pointers to the vnode and VM object. Use these fields for additional confirmation that you have the right target. You will also see pointers to extent map information and the bitfile-set and domain structures.
(dbx) p *(struct bfAccess *)$bfa
struct {
.....
    bfah = 452
    vp = 0xfffffc00018e8800
    obj = 0xfffffc00023d3860
.......
    bfSetp = 0xfffffc0001240e08
.......
    domainp = 0xfffffc000123e008
.......
    xtnts = struct {
        validFlag = 1
        xtntMap = 0xfffffc0003b328a8
        shadowXtntMap = (nil)
        stripeXtntMap = (nil)
13. Print the extent map.
For most bitfiles, this is not a very exciting structure; however, for bitfiles with many extents, this is the beginning of a mass of pointers. Note that the subXtntMap field is an array with validCnt elements. Each subXtntMap structure has an array of extents (bsXA) with cnt elements.
(dbx) p *(*(struct bfAccess *)$bfa).xtnts.xtntMap
struct {
nextXtntMap = (nil)
domain = 0xfffffc0000f24008
hdrType = 1
hdrVdIndex = 1
hdrMcellId = struct {
cell = 1
page = 985
}
blksPerPage = 16
nextValidPage = 0
allocDeallocPageCnt = 0
allocVdIndex = 65535
origStart = 0
origEnd = 0
updateStart = 1
updateEnd = 0
validCnt = 1
cnt = 1
maxCnt = 1
subXtntMap = 0xfffffc0001346f88
}
Now proceed to the bitfile-set, domain, and virtual disk data structures of AdvFS. Use the pointers of the bitfile access structure to find them.
14. Move from the bitfile access structure into the bitfile-set structure. Print it. There is a lot to see. Be sure to use the bitfile-set ID field to verify and match the values returned by showfsets.
(dbx) set $bfs=(bfSetT *)(((struct bfAccess *)$bfa)->bfSetp)
(dbx) p $bfs
(dbx) p *(bfSetT *)$bfs
struct {
bfSetId = struct {
domainId = struct {
tv_sec = 864927707
tv_usec = 860832
}
dirTag = struct {
num = 1
seq = 32769
}
}
.......
dmnp = 0xfffffc000123e008
(dbx) p (*(struct bfAccess *)$bfa).bfSetp
0xfffffc0005b7ca08
(dbx)
(dbx)
(dbx) set $bfs = 0xfffffc0005b7ca08
(dbx)
(dbx)
(dbx)
(dbx) whatis bfSetT
typedef struct bfSet {
dyn_hashlinks_w_keyT hashlinks;
bfSetName[32] of char ;
bfSetIdT bfSetId;
uint_t bfSetMagic;
int refCnt;
int logicalRefCnt;
domainT * dmnP;
bfsQueueT bfSetList;
mutexT accessChainLock;
bfAccessT * accessFwd;
bfAccessT * accessBwd;
dev_t dev;
bfTagT dirTag;
bfAccessT * dirBfAp;
bfSetT * cloneSetp;
bfSetT * origSetp;
uint32T cloneId;
uint32T cloneCnt;
uint32T numClones;
uint32T outOfSync;
mutexT cloneDelStateMutex;
stateLkT cloneDelState;
int xferThreads;
uint32T infoLoaded;
uint32T cachepolicy;
mutexT dirMutex;
ftxLkT dirLock;
bfsStateT state;
int bfCnt;
unsigned long tagFrLst;
unsigned long tagUnInPg;
unsigned long tagUnMpPg;
ftxLkT fragLock;
bfTagT fragBfTag;
bfAccessT * fragBfAp;
uint32T freeFragGrps;
uint32T truncating;
fragGrps[8] of fragGrpT ;
void * fsnp;
} bfSetT;
(dbx)
(dbx)
(dbx) p *(bfSetT *)$bfs
struct {
hashlinks = struct {
dh_links = struct {
dh_next = 0xfffffc0005b7ca08
dh_prev = 0xfffffc0005b7ca08
}
dh_key = 937059923
}
bfSetName = "usr"
bfSetId = struct {
domainId = struct {
tv_sec = 937059922
tv_usec = 653520
}
dirTag = struct {
num = 1
seq = 32769
}
}
bfSetMagic = 2918187010
refCnt = 1
logicalRefCnt = 1
dmnP = 0xfffffc0000f24008
bfSetList = struct {
bfsQfwd = 0xfffffc0005b7c7e8
bfsQbck = 0xfffffc0005b7cce8
}
accessChainLock = struct {
mutex = 0
}
accessFwd = 0xfffffc0005ae4d88
accessBwd = 0xfffffc0005af7b08
dev = -518913147
dirTag = struct {
num = 1
seq = 32769
}
dirBfAp = 0xfffffc0005af8008
cloneSetp = (nil)
origSetp = (nil)
cloneId = 0
cloneCnt = 0
numClones = 0
outOfSync = 0
cloneDelStateMutex = struct {
mutex = 0
}
cloneDelState = struct {
hdr = struct {
lkType = LKT_STATE
nxtFtxLk = (nil)
mutex = 0xfffffc0005b7cac8
lkUsage = LKU_CLONE_DEL
}
state = CLONE_DEL_NORMAL
pendingState = LKW_NONE
waiters = 0
cv = 0
}
xferThreads = 0
infoLoaded = 1
cachepolicy = 0
dirMutex = struct {
mutex = 0
}
dirLock = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0005b7cb10
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005b7cb40
prev = 0xfffffc0005b7cb40
}
l_caller = 3630684
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc0001c92f00
}
cv = 0
}
state = BFS_READY
bfCnt = 23723
tagFrLst = 24
tagUnInPg = 24
tagUnMpPg = 24
fragLock = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0005b7cb10
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0005b7cbb8
prev = 0xfffffc0005b7cbb8
}
l_caller = 3630684
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc0001c92600
}
cv = 0
}
fragBfTag = struct {
num = 1
seq = 32769
}
fragBfAp = 0xfffffc0005af7b08
freeFragGrps = 1
truncating = 0
fragGrps = {
[0] struct {
firstFreeGrp = 7280
lastFreeGrp = 32
}
[1] struct {
firstFreeGrp = 7216
lastFreeGrp = 5024
}
[2] struct {
firstFreeGrp = 7168
lastFreeGrp = 7168
}
[3] struct {
firstFreeGrp = 7248
lastFreeGrp = 7248
}
[4] struct {
firstFreeGrp = 7232
lastFreeGrp = 7232
}
[5] struct {
firstFreeGrp = 7120
lastFreeGrp = 7120
}
[6] struct {
firstFreeGrp = 7056
lastFreeGrp = 7056
}
[7] struct {
firstFreeGrp = 7264
lastFreeGrp = 7264
}
}
fsnp = 0xfffffc0005ab5088
}
(dbx)
(dbx)
(dbx) sh showfsets usr_domain
usr
Id : 37da6652.0009f8d0.1.8001
Files : 23704, SLim= 0, HLim= 0
Blocks (512) : 1026436, SLim= 0, HLim= 0
Quota Status : user=off group=off
var
Id : 37da6652.0009f8d0.2.8001
Files : 970, SLim= 0, HLim= 0
Blocks (512) : 165004, SLim= 0, HLim= 0
Quota Status : user=off group=off
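The match between the dbx output and showfsets is easier to see once the decimal fields are converted to hexadecimal: tv_sec 937059922 is 0x37da6652, tv_usec 653520 is 0x9f8d0, and seq 32769 is 0x8001. The following is a minimal sketch of that comparison. The helper names are hypothetical and are not part of AdvFS, and the dotted layout is inferred from the sample Id strings above rather than taken from the showfsets source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of AdvFS): renders the four bfSetId fields
 * in the dotted-hex layout that the sample showfsets output appears to use.
 */
static void demo_format_fset_id(char *buf, size_t len,
                                unsigned int tv_sec, unsigned int tv_usec,
                                unsigned int tag_num, unsigned int tag_seq)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%08x.%08x.%x.%04x", tv_sec, tv_usec, tag_num, tag_seq);
}

/* Returns 1 when the in-memory fields match the Id string printed by showfsets. */
static int demo_fset_id_matches(const char *showfsets_id,
                                unsigned int tv_sec, unsigned int tv_usec,
                                unsigned int tag_num, unsigned int tag_seq)
{
    char buf[64];

    demo_format_fset_id(buf, sizeof buf, tv_sec, tv_usec, tag_num, tag_seq);
    return strcmp(buf, showfsets_id) == 0;
}
```

Calling demo_fset_id_matches() with the usr Id string and the values 937059922, 653520, 1, and 32769 from the bfSetT dump reports a match; the var Id string differs only in the tag number.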
15. Now print the domain structure. (Do not use struct domain unless you want the structure for socket domains.) In the middle of this structure, you will see an array for pointers to virtual disk structures. There are also many fields used to control file domain I/O.
(dbx) set $d=(domainT *)(((bfSetT *)$bfs)->dmnp)
(dbx) p $d
0xfffffc000123e008
(dbx) p *(domainT *)$d
struct {
.......
domainName = "usr_domain"
.......
vdpTbl = {
[0] 0xfffffc0003a18388
.......
(dbx) p (*(bfSetT *)$bfs).dmnp
0xfffffc0000f24008
(dbx)
(dbx)
(dbx) set $d = 0xfffffc0000f24008
(dbx)
(dbx)
(dbx) p *(domainT *)$d
struct {
mutex = struct {
mutex = 0
}
dmnMagic = 2918187011
dmnFwd = 0xfffffc0000f24008
dmnBwd = 0xfffffc0000f24008
dmnHashlinks = struct {
dh_links = struct {
dh_next = 0xfffffc0000f24008
dh_prev = 0xfffffc0000f24008
}
dh_key = 937059922
}
dmnVersion = 4
state = BFD_ACTIVATED
domainId = struct {
tv_sec = 937059922
tv_usec = 653520
}
dualMountId = struct {
tv_sec = 0
tv_usec = 0
}
bfDmnMntId = struct {
tv_sec = 938694756
tv_usec = 891025
}
dmnAccCnt = 4
dmnRefWaiters = 0
activateCnt = 2
mountCnt = 2
bfSetDirp = 0xfffffc0005b7c788
bfSetDirTag = struct {
num = 4294967288
seq = 0
}
BfSetTblLock = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0000f24008
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f240a8
prev = 0xfffffc0000f240a8
}
l_caller = 3271972
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc000320bb00
}
cv = 0
}
bfSetHead = struct {
bfsQfwd = 0xfffffc0005b7cce8
bfsQbck = 0xfffffc0005b7c7e8
}
bfSetDirAccp = 0xfffffc0005af8488
ftxLogTag = struct {
num = 4294967287
seq = 0
}
ftxLogP = 0xfffffc0005baec48
ftxLogPgs = 512
logAccessp = 0xfffffc0005af8908
ftxTbld = struct {
rrNextSlot = 13
rrSlots = 30
ftxWaiters = 0
trimWaiters = 0
excWaiters = 0
slotCv = 0
trimCv = 0
excCv = 0
logTrimLsn = struct {
num = 0
}
nextNewSlot = 30
oldestFtxLa = struct {
read = 782
update = 783
lgra = {
[0] struct {
page = 184
offset = 1790
lsn = struct {
num = 2821672
}
}
[1] struct {
page = 185
offset = 656
lsn = struct {
num = 0
}
}
}
}
lastFtxId = 68203
slotUseCnt = 0
noTrimCnt = 0
tablep = 0xfffffc0001360808
oldestSlot = 13
totRoots = 68203
}
pinBlockBuf = (nil)
domainName = "usr_domain"
majorNum = 2055
flag = BFD_NORMAL
lsnLock = struct {
mutex = 0
}
lsnList = struct {
lsnFwd = 0xfffffe0407605e70
lsnBwd = 0xfffffe0407605e70
accFwd = (nil)
accBwd = (nil)
freeFwd = (nil)
freeBwd = (nil)
hashFwd = (nil)
hashBwd = (nil)
length = 1
touched = 0
ioOut = 0
lenLimit = 0
indexBuf = (nil)
}
writeToLsn = struct {
num = 0
}
pinBlockWait = 0
pinBlockCv = 0
pinBlockRunning = 0
contBits = 0
dirtyBufLa = struct {
read = 266
update = 266
lgra = {
[0] struct {
page = 185
offset = 656
lsn = struct {
num = 2821708
}
}
[1] struct {
page = 185
offset = 603
lsn = struct {
num = 2821706
}
}
}
}
scLock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f24308
prev = 0xfffffc0000f24308
}
l_caller = 3557160
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc0001c93500
}
scTbl = 0xfffffc00035fe008
vdpTblLock = struct {
mutex = 0
}
vdCnt = 1
vdpTbl = {
[0] 0xfffffc0000f2b508
[1] (nil)
[2] (nil)
(…)
[255] (nil)
}
rmvolTruncLk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f24b50
prev = 0xfffffc0000f24b50
}
l_caller = 3777480
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc00022f4300
}
bcStat = struct {
pinHit = 13744
pinHitWait = 207
pinRead = 0
refHit = 319958
refHitWait = 128
raBuf = 3376
ubcHit = 2107
unpinCnt = struct {
lazy = 13391
blocking = 56
clean = 10
log = 908
}
derefCnt = 330259
devRead = 9034
devWrite = 1790
unconsolidate = 0
consolAbort = 0
unpinFileType = struct {
meta = 10673
ftx = 12051
}
derefFileType = struct {
meta = 137070
ftx = 318466
}
}
bmtStat = struct {
fStatRead = 0
fStatWrite = 67009
resv1 = 0
resv2 = 0
bmtRecRead = {
[0] 0
[1] 0
[2] 0
[3] 0
[4] 0
[5] 0
[6] 0
[7] 0
[8] 0
[9] 0
[10] 0
[11] 0
[12] 0
[13] 0
[14] 0
[15] 0
[16] 0
[17] 0
[18] 0
[19] 0
[20] 0
[21] 0
}
bmtRecWrite = {
[0] 0
[1] 0
[2] 81
[3] 0
[4] 0
[5] 0
[6] 0
[7] 0
[8] 33
[9] 0
[10] 0
[11] 0
[12] 0
[13] 0
[14] 0
[15] 1
[16] 84
[17] 2
[18] 23
[19] 0
[20] 0
[21] 0
}
}
logStat = struct {
logWrites = 382
transactions = 12731
segmentedRecs = 2
logTrims = 0
wastedWords = 35019
maxLogPgs = 59
minLogPgs = 0
maxFtxWords = 317
maxFtxAgent = 65
maxFtxTblSlots = 29
oldFtxTblAgent = 0
excSlotWaits = 0
fullSlotWaits = 2
rsv1 = 0
rsv2 = 0
rsv3 = 0
rsv4 = 0
}
totalBlks = 1426112
freeBlks = 204928
dmn_panic = 0
xidRecovery = struct {
head = (nil)
tail = (nil)
current_free_slot = 0
timestamp = struct {
tv_sec = 0
tv_usec = 0
}
}
xidRecoveryLk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f24e08
prev = 0xfffffc0000f24e08
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = (nil)
}
smsync_policy = 0
metaPagep = 0xfffffe0400299008
fs_full_time = 0
}
(dbx)
16. The last major structure to print is used for virtual disks. You will see even more I/O control substructures here.
(dbx) set $vd=(struct vd *)(((domainT *)$d)->vdpTbl[0])
(dbx) p $vd
0xfffffc0003a18388
(dbx) p *(struct vd *)$vd
struct {
.......
vdName = "/dev/disk/dsk2g"
........
freeStgLst = 0xfffffc0003fcf188
(dbx) p (*(domainT *)$d).vdpTbl[0]
0xfffffc0000f2b508
(dbx)
(dbx)
(dbx) set $vd = 0xfffffc0000f2b508
(dbx)
(dbx)
(dbx) p *(struct vd *)$vd
struct {
stgCluster = 16
devVp = 0xfffffc0001a8c6c0
vdMagic = 2918187012
rbmtp = 0xfffffc0005af9688
bmtp = 0xfffffc0005af9208
sbmp = 0xfffffc0005af8d88
dmnP = 0xfffffc0000f24008
vdIndex = 1
maxPgSz = 16
bmtXtntPgs = 128
vdName = "/dev/disk/dsk1g"
vdState = BSR_VD_MOUNTED
vdSetupThd = (nil)
vdRefCnt = 0
vdRefWaiters = 0
vdStateLock = struct {
mutex = 0
}
vdSize = 1426112
vdSectorSize = 512
vdClusters = 89132
serviceClass = 1
mcell_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0000f24008
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f2b9a0
prev = 0xfffffc0000f2b9a0
}
l_caller = 3630684
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc00022f4600
}
cv = 0
}
nextMcellPg = 985
rbmt_mcell_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0000f24008
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f2ba00
prev = 0xfffffc0000f2ba00
}
l_caller = 0
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = (nil)
}
cv = 0
}
lastRbmtPg = 0
rbmtFlags = 0
stgMap_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0000f24008
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f2ba60
prev = 0xfffffc0000f2ba60
}
l_caller = 3575624
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc0001c93500
}
cv = 0
}
freeStgLst = 0xfffffc0001a8e928
numFreeDesc = 50
freeClust = 12807
scanStartClust = 34044
bitMapPgs = 2
spaceReturned = 1
fill1 = (nil)
fill3 = (nil)
fill4 = 0
del_list_lk = struct {
hdr = struct {
lkType = LKT_FTX
nxtFtxLk = (nil)
mutex = 0xfffffc0000f24008
lkUsage = LKU_UNKNOWN
}
lock = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f2baf0
prev = 0xfffffc0000f2baf0
}
l_caller = 3630684
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc00022f4600
}
cv = 0
}
ddlActiveLk = struct {
l_lock = 0
l_head = struct {
next = 0xfffffc0000f2bb28
prev = 0xfffffc0000f2bb28
}
l_caller = 3365256
l_wait_writers = 0
l_readers = 0
l_flags = '^@'
l_lifms = '\200'
l_info = 0
l_lastlocker = 0xfffffc00022f4600
}
ddlActiveWaitMCId = struct {
cell = 0
page = 0
}
ddlActiveWaitCv = 0
dStat = struct {
nread = 9034
nwrite = 1806
readblk = 194256
writeblk = 43648
seekCnt = 0
rglobBuf = 3543
rglobBlk = 56688
rglob = 436
wglobBuf = 1292
wglobBlk = 20672
wglob = 370
blockingQ = 0
waitLazyQ = 3
readyLazyQ = 0
consolQ = 0
devQ = 0
}
vdIoLock = struct {
mutex = 0
}
blockingQ = struct {
fwd = 0xfffffc0000f2bbe0
bwd = 0xfffffc0000f2bbe0
length = 0
lenLimit = 0
}
waitLazyQ = struct {
fwd = 0xfffffe0407605fa0
bwd = 0xfffffe0407605fa0
length = 1
lenLimit = 0
}
smSyncQ = {
[0] struct {
fwd = 0xfffffc0000f2bc10
bwd = 0xfffffc0000f2bc10
length = 0
lenLimit = 0
}
[1] struct {
fwd = 0xfffffc0000f2bc28
bwd = 0xfffffc0000f2bc28
length = 0
lenLimit = 0
}
(…)
[15] struct {
fwd = 0xfffffc0000f2bd78
bwd = 0xfffffc0000f2bd78
length = 0
lenLimit = 0
}
}
readyLazyQ = struct {
fwd = 0xfffffc0000f2bd90
bwd = 0xfffffc0000f2bd90
length = 0
lenLimit = 1024
}
consolQ = struct {
fwd = 0xfffffc0000f2bda8
bwd = 0xfffffc0000f2bda8
length = 0
lenLimit = 0
}
devQ = struct {
fwd = 0xfffffc0000f2bdc0
bwd = 0xfffffc0000f2bdc0
length = 0
lenLimit = 0
}
blockingCnt = 0
blockingFact = 4
rdmaxio = 256
wrmaxio = 256
vdIoOut = 0
start_active = 0
gen_active = 0
active = struct {
hdr = struct {
lkType = LKT_STATE
nxtFtxLk = (nil)
mutex = 0xfffffc0000f2bbd8
lkUsage = LKU_VD_ACTIVE
}
state = INACTIVE_DISK
pendingState = LKW_NONE
waiters = 0
cv = 0
}
advfs_start_more_posted = 0
blkQ_cnt = 12899
lazyQ_cnt = 13413
smsyncQ_cnt = 3706
readyQ_cnt = 2027
consolQ_cnt = 1976
devQ_cnt = 1970
rmioq_cnt = 11441
rmormvq_cnt = 1
syncQIndx = 12
consolidate = 1
max_iosize_rd = 1048576
max_iosize_wr = 1048576
preferred_iosize_rd = 131072
preferred_iosize_wr = 131072
qtodev = 3
freeRsvdStg = struct {
start_clust = 1545
num_clust = 4370
prevp = 0xfffffc0000f2be90
nextp = 0xfffffc0000f2be90
}
}
17. For the finale, print the data structure of the free storage cache.
(dbx) set $fst=(stgDescT *)(((struct vd *)$vd)->freeStgLst)
(dbx) p $fst
0xfffffc0003fcf188
(dbx) p *(stgDescT *)$fst
struct {
start_clust = 705832
num_clust = 14360
prevp = 0xfffffc0003fcf188
nextp = 0xfffffc0003fcf188
}
(dbx) p (*(struct vd *)$vd).freeStgLst
0xfffffc0001a8e928
(dbx)
(dbx)
(dbx) set $fst = 0xfffffc0001a8e928
(dbx)
(dbx)
(dbx) p *(stgDescT *)$fst
struct {
start_clust = 6137
num_clust = 3745
prevp = 0xfffffc0001a8ef48
nextp = 0xfffffc0001a8e948
}
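The descriptor printed above is one node of a doubly linked list reached through freeStgLst. The following sketch shows how such a list could be walked to total the cached free clusters. It assumes the list is circular, which the single self-referencing descriptor in the first sample suggests but does not prove. The demoStgDescT type mirrors only the four fields visible in the dbx output and is a hypothetical stand-in for the real stgDescT.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors only the four fields visible in the dbx output; the real stgDescT may differ. */
typedef struct demoStgDesc {
    unsigned int start_clust;
    unsigned int num_clust;
    struct demoStgDesc *prevp;
    struct demoStgDesc *nextp;
} demoStgDescT;

/*
 * Walks the list from head through nextp until it arrives back at head
 * (assumed circular) or reaches NULL. Returns the total number of free
 * clusters and stores the number of descriptors visited in *desc_cnt.
 */
static unsigned long demo_sum_free_clusters(const demoStgDescT *head, size_t *desc_cnt)
{
    unsigned long total = 0;
    size_t n = 0;
    const demoStgDescT *p = head;

    while (p != NULL) {
        /* Sanity check: the successor must point back to this node. */
        assert(p->nextp == NULL || p->nextp->prevp == p);
        total += p->num_clust;
        n++;
        p = p->nextp;
        if (p == head)
            break;
    }
    if (desc_cnt != NULL)
        *desc_cnt = n;
    return total;
}
```

In a live session the two results would be natural candidates to compare against the freeClust and numFreeDesc fields of the vd structure, although this guide does not state that they must agree.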
4
AdvFS System Calls and Kernel Interfaces
About This Chapter
Introduction
This chapter presents information on entries into AdvFS. Specifically:
• Virtual file system (VFS) switch table
• The vnode switch table
• Unified Buffer Cache (UBC) interface
• Device driver callback
• Lightweight context (LWC) interface
• I/O completion function
• True AdvFS system call
• Types of AdvFS system calls
• Domains and volumes
• Filesets
• Miscellaneous operations
• Algorithms for startup and recovery
• Storage management algorithms
• Cloning algorithms
• File migration and deletion algorithms
• Algorithms for threads
Objectives
To describe the entries into AdvFS, you should be able to:
• List the various entry points to AdvFS.
• Determine how an AdvFS system call is processed.
• Describe the algorithms for startup and recovery.
• Explain the storage management algorithms.
• Define the cloning algorithms.
• Describe the file migration and deletion algorithms.
• Describe the algorithms for threads.
Resources
For more information on topics in this chapter, see the following sources:
• /usr/include/sys/mount.h
• /usr/include/sys/vnode.h
• msfs/osf/msfs_vfsops.c
• msfs/osf/msfs_vnops.c
• msfs/osf/msfs_misc.c
• msfs/osf/msfs_io.c
• msfs/bs/bs_qio.c
• msfs/bs/bs_misc.c
• kernel/msfs/msfs/msfs_syscalls.h
Describing Entries to AdvFS
Overview
This section describes the interaction between AdvFS and the kernel. The material provides some familiarity with routines that appear in crash dumps and during live debugging. Topics include:
• VFS switch table
• File-related system calls (VFS and vnode switch table)
• Unified buffer cache interface
• Device driver interface routines
• AdvFS system calls
VFS Switch Table
The VFS switch table defines a series of pointers to functions implementing many of the file system level activities. The VFS switch table has:
• Thirteen entry points for file system operations (includes V5 smooth synchronization)
• An interface defined in:
— /usr/include/sys/mount.h
— struct vfsops * m_op;
• The interface is implemented in:
— msfs/osf/msfs_vfsops.c
The following excerpt from msfs_vfsops.c shows the 13 VFS switch table routine names for AdvFS (MSFS) activities.
Example 4-1: VFS Switch Table Routine List
/*
 * msfs_vfsops
 *
 * Defines function pointers to AdvFS specific VFS fs operations.
 */
struct vfsops msfs_vfsops = {
    msfs_mount,
    msfs_start,
    msfs_unmount,
    msfs_root,
    advfs_quotactl,
    msfs_statfs,
    msfs_sync,
    msfs_fhtovp,
    msfs_vptofh,
    msfs_init,
    msfs_mountroot,
    msfs_noop,
    msfs_smoothsync,
};
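A switch table of this kind lets the generic file system layer call a file-system-specific routine without knowing which file system is mounted. The sketch below illustrates the dispatch pattern only. Every demo_ name is hypothetical, the table is cut down to two slots, and the function signatures are simplified assumptions rather than the real declarations in mount.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-slot operations table; the real struct vfsops has 13 slots. */
struct demo_vfsops {
    int (*vfs_mount)(const char *what);
    int (*vfs_sync)(void);
};

/* Hypothetical mount structure holding the per-file-system table pointer. */
struct demo_mount {
    const struct demo_vfsops *m_op;
};

static int demo_mount_calls;   /* counts calls so the dispatch can be observed */

static int demo_msfs_mount(const char *what)
{
    (void)what;
    demo_mount_calls++;
    return 0;
}

static int demo_msfs_sync(void)
{
    return 0;
}

/* Positional initialization, in the same style as msfs_vfsops. */
static const struct demo_vfsops demo_msfs_vfsops = {
    demo_msfs_mount,
    demo_msfs_sync,
};

/* Generic layer: knows only the table, not which file system is behind it. */
static int demo_vfs_mount(const struct demo_mount *mp, const char *what)
{
    assert(mp != NULL && mp->m_op != NULL && mp->m_op->vfs_mount != NULL);
    return mp->m_op->vfs_mount(what);
}
```

Because the table is filled in positionally, the order of the entries matters: each routine must sit in the slot that the structure definition assigns to that operation.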
vnode Switch Table
The vnode switch table defines a series of pointers to functions implementing most of the file-oriented AdvFS activities. The vnode switch table has:
• Forty-two entry points for file operations.
• An interface defined in:
— /usr/include/sys/vnode.h
— struct vnodeops * v_op;
• The interface is implemented in msfs/osf/msfs_vnops.c
The following example lists the 42 entry points for vnode operations involving AdvFS.
Example 4-2: File (vnode) Operations
/*
 * msfs_vnodeops
 *
 * Defines function pointers to AdvFS specific VFS vnode operations.
 */
struct vnodeops msfs_vnodeops = {
    msfs_lookup,        /* lookup */
    msfs_create,        /* create */
    msfs_mknod,         /* mknod */
    msfs_open,          /* open */
    msfs_close,         /* close */
    msfs_access,        /* access */
    msfs_getattr,       /* getattr */
    msfs_setattr,       /* setattr */
    msfs_read,          /* read */
    msfs_write,         /* write */
    msfs_ioctl,         /* ioctl */
    seltrue,            /* select */
    msfs_mmap,          /* mmap */
    msfs_fsync,         /* fsync */
    msfs_seek,          /* seek */
    msfs_remove,        /* remove */
    msfs_link,          /* link */
    msfs_rename,        /* rename */
    msfs_mkdir,         /* mkdir */
    msfs_rmdir,         /* rmdir */
    msfs_symlink,       /* symlink */
    msfs_readdir,       /* readdir */
    msfs_readlink,      /* readlink */
    msfs_abortop,       /* abortop */
    msfs_inactive,      /* inactive */
    msfs_reclaim,       /* reclaim */
    msfs_bmap,          /* bmap */
    msfs_strategy,      /* strategy */
    msfs_print,         /* print */
    msfs_page_read,     /* page_read */
    msfs_page_write,    /* page_write */
    msfs_swap,          /* swap handler */
    msfs_bread,         /* buffer read */
    msfs_brelse,        /* buffer release */
    msfs_lockctl,       /* file locking */
    msfs_syncdata,      /* fsync byte range */
    msfs_noop,          /* Lock a node */
    msfs_noop,          /* Unlock a node */
    msfs_getproplist,   /* Get extended attributes */
    msfs_setproplist,   /* Set extended attributes */
    msfs_delproplist,   /* Delete extended attributes */
    msfs_pathconf,      /* pathconf */
};
UBC Interface
Ultimately, data read from an AdvFS file system ends up in a memory location that is associated with the unified buffer cache (UBC). The UBC interface consists of:
• vnode operations used in paging.
• msfs_getpage to obtain a page from disk.
• msfs_putpage to write a page to the disk.
The implementation is in msfs/osf/msfs_misc.c.
Device Driver Interface Routines
Eventually an AdvFS I/O must cause device activity to begin. The device activity is triggered by device driver routines. The device driver interface consists of:
• In AdvFS struct buf:
— b_iodone field contains the address of msfs_iodone()
— Represents a buffer of data
— Listhead is bsBufList
• At interrupt: device driver calls msfs_iodone()
• msfs_iodone()
— Temporarily raises system priority level.
— Places buffer on MsfsIodoneBuf queue (holds completed I/O operations for AdvFS) found within the processor structure.
— Posts LWC_PRI_MSFS_UBC.
The implementation is in msfs/osf/msfs_io.c.
AdvFS Lightweight Context Interface
After the high system priority level processing, the remaining buffer work is done through the lightweight context (LWC) interface. The LWC interface consists of:
• Priority: LWC_PRI_MSFS_UBC
• Entry: msfs_async_iodone_lwc()
• msfs_async_iodone_lwc()
— Removes buffer from MsfsIodoneBuf
— Calls bs_osf_complete()
The implementation is in msfs/osf/msfs_io.c.
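The two-stage completion path described above (a short interrupt-level callback that only queues the buffer, followed by deferred processing that does the real work) can be sketched as follows. This is an illustration of the pattern under simplifying assumptions, not the AdvFS implementation: all demo_ names are hypothetical, a plain flag stands in for posting LWC_PRI_MSFS_UBC, a single global queue stands in for the per-processor MsfsIodoneBuf queue, and no priority-level or locking code is shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced buffer header: only what the sketch needs. */
struct demo_buf {
    void (*b_iodone)(struct demo_buf *);  /* completion callback set by the file system */
    struct demo_buf *next;                /* link on the completed-I/O queue */
    int completed;                        /* set once deferred processing has run */
};

static struct demo_buf *demo_iodone_head;  /* stands in for the MsfsIodoneBuf queue */
static struct demo_buf *demo_iodone_tail;
static int demo_lwc_posted;                /* stands in for posting the LWC */

/* Interrupt side: do as little as possible, then request deferred work. */
static void demo_iodone(struct demo_buf *bp)
{
    assert(bp != NULL);
    bp->next = NULL;
    if (demo_iodone_tail != NULL)
        demo_iodone_tail->next = bp;
    else
        demo_iodone_head = bp;
    demo_iodone_tail = bp;
    demo_lwc_posted = 1;
}

/* Deferred side: drain the queue and finish each buffer. Returns the count. */
static int demo_async_iodone_lwc(void)
{
    int n = 0;

    demo_lwc_posted = 0;
    while (demo_iodone_head != NULL) {
        struct demo_buf *bp = demo_iodone_head;

        demo_iodone_head = bp->next;
        if (demo_iodone_head == NULL)
            demo_iodone_tail = NULL;
        bp->next = NULL;
        bp->completed = 1;   /* stands in for the call to bs_osf_complete() */
        n++;
    }
    return n;
}

/* Driver side: at interrupt time, call whatever callback the buffer carries. */
static void demo_driver_interrupt(struct demo_buf *bp)
{
    assert(bp != NULL && bp->b_iodone != NULL);
    bp->b_iodone(bp);
}
```

Keeping the interrupt-side routine this small is the point of the split: the expensive completion work runs later, at a lower priority.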
AdvFS I/O Completion Function
The I/O completion function:
• Checks for many errors
— If appropriate, prints error messages
— If an error occurs while writing to the log, panics the kernel
• Calls bs_io_complete() to reach the BAS layer
• Initiates more I/O if appropriate
Source location is msfs/bs/bs_qio.c.
True AdvFS System Call
The true AdvFS system call is:
• msfs_real_syscall()
— Single call, many flavors
— Called through MsfsSyscallp, which is filled in when AdvFS is started with the lower 32 bits of the KSEG address of msfs_real_syscall()
— MsfsSyscallp + 0xfffffc0000000000 = &msfs_real_syscall()
• First argument is the operation type (used in a large case statement to determine the action)
Source location: msfs/bs/bs_misc.c.
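The address arithmetic above is useful when a crash dump shows only the 32-bit MsfsSyscallp value. The following sketch shows the conversion in both directions. The helper names are hypothetical, and the sample addresses used with them are illustrative KSEG addresses, not the actual location of msfs_real_syscall().

```c
#include <assert.h>
#include <stdint.h>

/* Base of the KSEG segment as given in the text above. */
#define DEMO_KSEG_BASE UINT64_C(0xfffffc0000000000)

/* Rebuilds a full KSEG address from its stored lower 32 bits. */
static uint64_t demo_kseg_addr(uint32_t low32)
{
    return DEMO_KSEG_BASE + (uint64_t)low32;
}

/* Extracts the lower 32 bits; valid only for addresses within 4 GB of the base. */
static uint32_t demo_kseg_low32(uint64_t addr)
{
    assert((addr & ~UINT64_C(0xffffffff)) == DEMO_KSEG_BASE);
    return (uint32_t)(addr & UINT64_C(0xffffffff));
}
```

For example, a stored value of 0x00324d1c would correspond to the full address 0xfffffc0000324d1c.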
The following code shows the argument list for msfs_real_syscall().
Example 4-3: Prototype for msfs_real_syscall()
int
msfs_real_syscall(
    opTypeT opType,        /* in - msfs operation to be performed */
    libParamsT *parmBuf,   /* in - ptr to op-specific parameters buffer; */
                           /* contents are modified. */
    int parmBufLen         /* in - byte length of parmBuf */
    )
{
Types of AdvFS System Calls
There are 60 operation types of AdvFS system calls. The user interface (library wrappers for system calls) is:
• Compiled into /usr/shlib/libmsfs.so.
• Included from msfs_syscalls.h.
The source is located at kernel/msfs/msfs/msfs_syscalls.h.
The following excerpt from msfs_syscalls.h lists the operation types that could be in the first argument to msfs_real_syscall().
Example 4-4: Operation Types within msfs_real_syscall()
typedef enum {
    OP_NONE,
    OP_GET_BF_PARAMS,
    OP_SET_BF_ATTRIBUTES,
    OP_GET_BF_XTNT_MAP,
    OP_ADD_STG,
    OP_ADD_OVER_STG,
    OP_MIGRATE,
    OP_DMN_INIT,
    OP_GET_DMNNAME_PARAMS,
    OP_GET_DMN_PARAMS,
    OP_SET_DMN_PARAMS,
    OP_GET_DMN_VOL_LIST,
    OP_GET_VOL_PARAMS,
    OP_SET_VOL_IOQ_PARAMS,
    OP_DUMP_LOCKS,
    OP_TRACE,
    OP_FSET_CREATE,
    OP_FSET_DELETE,
    OP_FSET_CLONE,
    OP_FSET_GET_INFO,
    OP_FSET_GET_ID,
    OP_GET_BFSET_PARAMS,
    OP_SET_BFSET_PARAMS,
    OP_ADD_VOLUME,
    OP_CRASH,
    OP_MSS_RESV1,
    (...)
    OP_MSS_RESV17,
    OP_UNDEL_ATTACH,
    OP_UNDEL_DETACH,
    OP_UNDEL_GET,
    OP_GET_NAME,
    OP_REM_STG,
    OP_EVENT,
    OP_TAG_STAT,
    OP_SWITCH_LOG,
    OP_GET_BF_IATTRIBUTES,
    OP_SET_BF_IATTRIBUTES,
    OP_MOVE_BF_METADATA,
    OP_GET_VOL_BF_DESCS,
    OP_REM_VOLUME,
    OP_ADD_REM_VOL_SVC_CLASS,
    OP_SWITCH_ROOT_TAGDIR,
    OP_SET_BF_NEXT_ALLOC_VOL,
    OP_DISK_ERROR,
    OP_FTX_PROF,
    OP_REWRITE_XTNT_MAP,
    OP_RESET_FREE_SPACE_CACHE,
    OP_SET_NEXT_TAG,
    OP_REM_NAME,
    OP_REM_BF,
    OP_FSET_RENAME,
    OP_GET_LOCK_STATS,
    OP_FSET_GET_STATS,
    OP_GET_BKUP_XTNT_MAP,
    OP_GET_VOL_PARAMS2,
    OP_GET_GLOBAL_STATS,
    OP_GET_SMSYNC_STATS,
    OP_GET_IDX_BF_PARAMS,
    OP_ADD_REM_VOL_DONE
} opIndexT;
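The "single call, many flavors" design means that one entry point validates the parameter buffer and then selects a handler based on the operation type. The sketch below shows that shape with a three-value enum. All demo_ names, the parameter structure, and the return codes are hypothetical; the real routine handles every operation type listed above and uses op-specific parameter layouts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-value stand-in for the operation-type enum. */
typedef enum {
    DEMO_OP_NONE,
    DEMO_OP_GET_DMN_PARAMS,
    DEMO_OP_FSET_CREATE
} demoOpT;

/* Hypothetical parameter buffer; the real libParamsT is op-specific. */
typedef struct {
    int result;
} demoParamsT;

static int demo_op_get_dmn_params(demoParamsT *p)
{
    assert(p != NULL);   /* the dispatcher has already validated the buffer */
    p->result = 1;
    return 0;
}

static int demo_op_fset_create(demoParamsT *p)
{
    assert(p != NULL);
    p->result = 2;
    return 0;
}

/* Single entry point: validate the buffer, then switch on the operation type. */
static int demo_real_syscall(demoOpT opType, demoParamsT *parmBuf, int parmBufLen)
{
    if (parmBuf == NULL || parmBufLen != (int)sizeof(*parmBuf))
        return -1;

    switch (opType) {
    case DEMO_OP_GET_DMN_PARAMS:
        return demo_op_get_dmn_params(parmBuf);
    case DEMO_OP_FSET_CREATE:
        return demo_op_fset_create(parmBuf);
    default:
        return -1;       /* unknown or unsupported operation */
    }
}
```

When a crash or debugger stack shows the real entry point, the value of its first argument identifies which of the operations in the enum was being performed.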
Domains and Volumes
This section specifies some of the central routines associated with some common AdvFS domain and volume commands. Each call related to domains and volumes is listed with the utility that uses it:
• msfs_dmn_init() mkfdmn
The following example shows the argument list for msfs_dmn_init().
Example 4-5: Prototype for msfs_dmn_init()
mlStatusT
msfs_dmn_init(
    char* domain,             /* in - bf domain name */
    int maxVols,              /* in - maximum number of virtual disks */
    u32T logPgs,              /* in - number of pages in log */
    mlServiceClassT logSvc,   /* in - log service attributes */
    mlServiceClassT tagSvc,   /* in - tag directory service attributes */
    char *volName,            /* in - block special device name */
    mlServiceClassT volSvc,   /* in - service class */
    u32T volSize,             /* in - size of the virtual disk */
    u32T bmtXtntPgs,          /* in - number of pages per BMT extent */
    u32T bmtPreallocPgs,      /* in - number of pages to be preallocated for the BMT */
    u32T domainVersion,       /* in - on-disk version of domain */
    mlBfDomainIdT* bfDomainId /* out - domain id */
    );
• msfs_add_volume() addvol
The following example shows the argument list for msfs_add_volume().
Example 4-6: Prototype for msfs_add_volume()
mlStatusT
msfs_add_volume(
    char *domain,              /* in - domain name */
    char *volName,             /* in - block special device name */
    mlServiceClassT *volSvc,   /* in/out - service class */
    u32T volSize,              /* in - size of the virtual disk */
    u32T bmtXtntPgs,           /* in - number of pages per BMT extent */
    u32T bmtPreallocPgs,       /* in - number of pages to be preallocated for the BMT */
    mlBfDomainIdT *bfDomainId, /* out - the domain id */
    u32T *volIndex             /* out - vol index */
    );
• advfs_remove_volume() rmvol
The following example shows the argument list for advfs_remove_volume().
Example 4-7: Prototype for advfs_remove_volume()
mlStatusT
advfs_remove_volume(
    mlBfDomainIdT bfDomainId,  /* in */
    u32T volIndex,             /* in */
    u32T forceFlag             /* in */
    );
• msfs_syscall_op_get_dmn_params() showfdmn
This example shows the msfs_syscall_op_get_dmn_params() argument list.
Example 4-8: Prototype for msfs_syscall_op_get_dmn_params()
mlStatusT
msfs_syscall_op_get_dmn_params(
    libParamsT *libBufp
    );
• msfs_syscall_op_get_dmn_vol_list()
Filesets
This section specifies some of the routines associated with some common fileset-oriented commands. Calls related to filesets consist of:
• System call
• Utility
Routines include:
• msfs_fset_create() mkfset
The following example shows the argument list for the msfs_fset_create() routine found in msfs_syscalls.h.
Example 4-9: Prototype for msfs_fset_create()
mlStatusT
msfs_fset_create(
    char *domain,             /* in - domain name */
    char *setName,            /* in - set's name */
    mlServiceClassT reqServ,  /* in - required service class */
    mlServiceClassT optServ,  /* in - optional service class */
    u32T userId,              /* in - user id */
    gid_t quotaId,            /* in - group ID for quota files */
    mlBfSetIdT *bfSetId       /* out - bitfile set id */
    );
The other routines mentioned in this section are also prototyped in msfs_syscalls.h.
• msfs_fset_clone() clonefset
• msfs_fset_delete() rmfset
• msfs_set_bfset_params() chfsets
And many more.
Miscellaneous Operations
Here are a few more miscellaneous operations prototyped in msfs_syscalls.h:
• advfs_migrate()
Moves blocks of an open file
• msfs_syscall_op_set_bf_attributes()
Stripes a file
• msfs_undel_attach()
Attaches a trashcan directory
Starting Up and Recovering in AdvFS
Overview
This section discusses the logic behind some common AdvFS-specific activities. This material should serve as a guide to important routines supporting AdvFS.
• Startup and recovery overview
• Mounting the file system
• Activating the bitfile-set
• Activating the domain
• Recovering a domain
Startup and Recovery Overview
Starting up AdvFS involves several steps:
1. Begin with a mount(2) system call or vfs_mountroot(), which does part of the job.
2. Invoke msfs_mount() found in msfs_vfsops.c.
3. Call get_domain_disks().
This searches the /etc/fdmns/domain directory for a list of virtual disks.
4. Call advfs_mountfs() (found in msfs_vfsops.c) to do the real work.
Mounting the File System
To mount the file system:
1. Obtain names of the fileset.
2. Activate the bitfile-set with bs_bfset_activate().
3. Initialize various in-memory structures.
4. Open significant bitfiles (tagdir, root, fragment).
5. Link the file system into mount list.
Source location: msfs/osf/msfs_vfsops.c
Activating the Bitfile-Set
Use bs_bfset_activate_int() to activate or find a domain structure.
• bs_bfdmn_tbl_activate() finds the appropriate bitfile-set.
• bs_bfs_find_set() looks in the root tag directory.
Source location: msfs/bs/bs_bitfile_sets.c
Activating the Domain and Searching for Virtual Disks
Use bs_bfdmn_tbl_activate() to search for virtual disks. If the domain is not active:
1. Search the virtual disks of the domain.
2. Check for consistency between:
— Virtual disk count on disk
— Number of links in /etc/fdmns
3. Find the transaction log.
4. Activate the domain using bs_bfdmn_activate().
Source location: msfs/bs/bs_domain.c
Activating the Domain: Full Activation
Use bs_bfdmn_activate() to activate the domain.
1. Open the transaction log using lgr_open().
2. Open root tag directory when appropriate.
3. Start crash recovery activities with ftx_bfdmn_recovery().
4. Remove delete-pending filesets.
Source location: msfs/bs/bs_domain.c
Recovering a Domain
Use ftx_bfdmn_recovery() to recover a domain with one of three recovery passes:
• Pass 1 -- RBMT file
• Pass 2 -- Other reserved metadata bitfiles
• Pass 3 -- Other metadata bitfiles
After the three passes, perform any further recovery actions.
Source location: msfs/bs/ftx_recovery.c.
Recovery Pass: Recovers Domain Consistency
Use ftx_recovery_pass() to recover domain consistency.
To scan the log:
1. Read a record.
2. Put it in the slot for this FTX ID (allocating a new slot if needed).
a. On pass 1, buffer continuation and root done records.
b. If the record matches the current pass, perform:
* Record image redo records.
* Operation redo record.
c. If level and member are zero, free the FTX slot.
3. Perform routine ftx_recovery_pass() of msfs/bs/ftx_recovery.c.
4. Loop through the remaining FTX slots.
5. If level is not zero, this is part of an uncompleted transaction.
a. Fail the transaction.
b. Execute the undo records in a pass-appropriate manner.
6. If level is zero, perform the root done operations.
Providing Storage Management
Overview
This section discusses routines that provide storage allocation.
• Bitfile access subsystem (BAS) level storage allocation
• File access subsystem (FAS) level storage allocation
• Truncating bitfiles, fragment creation
BAS-Level Storage Allocation
Some storage bitmap (SBM) information is cached in memory data structures:
• Disk free storage list:
— Starting address and size of free storage
— May not be large enough to hold all free storage locations, especially if the disk is very fragmented
• BAS-level routines add storage:
— Without much regard to efficiency
— Although they will join adjacent grants into one extent (thus small sequential extents may become one)
Source locations:
• msfs/bs/bs_stg.c
• msfs/bs/bs_sbm.c
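The joining of adjacent grants can be pictured with a small sketch. This is not the code in bs_stg.c; the extent_t type and the add_grant() routine are hypothetical names chosen for illustration, and the sketch only merges a grant that begins exactly where the previous extent ends.

```c
/* Hypothetical extent descriptor: a run of contiguous disk blocks. */
typedef struct {
    unsigned long start;   /* first block of the run */
    unsigned long count;   /* number of blocks in the run */
} extent_t;

/*
 * Record a newly granted run of blocks in an extent list.
 * If the grant starts where the last extent ends, the last extent
 * grows; otherwise a new extent is appended.
 * Returns the new number of extents, or -1 if the list is full.
 */
int add_grant(extent_t *list, int n, int cap,
              unsigned long start, unsigned long count)
{
    if (count == 0)
        return n;                          /* nothing granted */
    if (n > 0 && list[n - 1].start + list[n - 1].count == start) {
        list[n - 1].count += count;        /* adjacent: join into one extent */
        return n;
    }
    if (n == cap)
        return -1;                         /* no room for another extent */
    list[n].start = start;
    list[n].count = count;
    return n + 1;
}
```

Two 16-block grants at blocks 100 and 116 end up as one 32-block extent, while a later grant at block 200 starts a second extent.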
FAS-Level Storage Allocation
One concern of FAS-level storage is page-write clustering. If the file is being written sequentially, data space is preallocated in page sizes of:
• MIN (pg_to_write/4, MAX_PREALLOC_PAGES)
• pg_to_write is present page number
• MAX_PREALLOC_PAGES is presently 16
If this fails, data space is allocated as needed. BAS level will combine adjacent allocations.
Source location of fs_read_write_stg() is msfs/fs/fs_read_write.c.
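The preallocation formula above can be written out directly. The routine name prealloc_pages() is hypothetical and integer division is assumed; the actual logic in fs_read_write_stg() may differ in detail.

```c
#define MAX_PREALLOC_PAGES 16UL   /* value quoted in the text above */

/*
 * Number of pages to preallocate when a sequential write reaches
 * page pg_to_write: MIN(pg_to_write / 4, MAX_PREALLOC_PAGES).
 */
unsigned long prealloc_pages(unsigned long pg_to_write)
{
    unsigned long want = pg_to_write / 4;

    return want < MAX_PREALLOC_PAGES ? want : MAX_PREALLOC_PAGES;
}
```

Under these assumptions the preallocation grows with the file until page 64, after which it stays at 16 pages per request.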
Truncating Bitfiles
AdvFS preallocates disk space to prevent multiple trips to the SBM information and to promote large extents. If the file write does not demand all preallocated disk pages, file truncation and possibly fragmentation will take place upon file close:
• When bitfile closes, AdvFS determines if last page should be allocated in the fragment file.
• If necessary:
— A fragment is allocated.
— Last page is now unused.
• If there are unused pages at end of file:
— Unused pages are deallocated.
— This can result in the release of small disk areas.
Source locations:
• fs_create_frag() in msfs/fs/fs_file_sets.c for file fragmentation.
• bf_setup_truncation() in msfs/fs/fs_create.c for file truncation.
Cloning
Overview
Cloning a fileset creates a read-only, pseudo copy (snapshot) of a file system, usually for the purpose of online backups. This section discusses some routines used in cloning.
• Creating a clone
• Writing to a cloned original
• Reading from a clone
• Deleting bitfile from cloned original
Creating a Clone
This section introduces several routines involved in clone creation. The fs_fset_clone() routine performs various access checks.
The following example shows the fs_fset_clone() argument list.
Example 4-10: Prototype for fs_fset_clone()
/*
 * fs_fset_clone
 *
 * Creates a clone file set of an 'original' file set.
 */

statusT
fs_fset_clone(
    char *domain,               /* in - name of set's domain */
    char *origSetName,          /* in - name of orig set */
    char *cloneSetName,         /* in - name of new clone set */
    bfSetIdT *retCloneBfSetId,  /* out - clone set's id */
    long xid                    /* in - CFS transaction id */
    )
The bs_bfs_clone() routine:
• Creates new bitfile-set.
• Copies original’s tagfile to clone’s tagfile.
• Makes appropriate modifications to bitfile-set attributes record.
Files open when cloning may not have perfect snapshots.
Source locations:
• fs_fset_clone() in msfs/fs/fs_file_sets.c.
• bs_bfs_clone() in msfs/bs/bs_bitfile_sets.c.
Writing to a Cloned Original
This section lists the steps needed to handle an altered original:
• Bitfile pages of original are copy-on-write.
• On first modification of bitfile:
— New mcell is allocated for clone bitfile.
— Original and clone primary mcells are now different.
• On first modification of bitfile page:
— New extent is allocated for clone bitfile.
— Original data is copied to clone’s extent.
— Clone extent map has holes for original data.
• Source location in msfs/bs/bs_bitfile_sets.c:
— bs_cow_pg()
— bs_cow()
— clone()
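The page-level copy-on-write behavior listed above can be sketched as follows. This is a simplified in-memory model, not the code in bs_cow_pg(): the bitfile_t and clone_t types, the fixed page count, and the cow_write() routine are all assumptions made for illustration, and mcell handling is left out.

```c
#include <string.h>

#define NPAGES  4
#define PAGE_SZ 8

/* Hypothetical stand-in for the original bitfile. */
typedef struct {
    char data[NPAGES][PAGE_SZ];
} bitfile_t;

/* Hypothetical stand-in for the clone bitfile. */
typedef struct {
    char data[NPAGES][PAGE_SZ];
    int  has_page[NPAGES];   /* 0 = hole: the page is still read from the original */
} clone_t;

/*
 * Write one page of the original. On the first modification of a page,
 * the old contents are copied into the clone before being overwritten.
 */
void cow_write(bitfile_t *orig, clone_t *cl, int pg, const char *buf)
{
    if (!cl->has_page[pg]) {
        memcpy(cl->data[pg], orig->data[pg], PAGE_SZ);   /* preserve snapshot */
        cl->has_page[pg] = 1;
    }
    memcpy(orig->data[pg], buf, PAGE_SZ);                /* then modify original */
}
```

Only the first write to a page pays the cost of the copy; later writes to the same page go straight to the original, and untouched pages remain holes in the clone.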
Reading from a Clone
Use the following sequence to read from a clone.
1. Determine if clone bitfile has requested page.
2. If not:
a. Determine if page really is within range of clone bitfile.
b. Check extent maps of original bitfile for page.
3. If a page is written into a hole of the original, the clone must be given a “permanent hole” extent.
Note that AdvFS optimizes I/O to the original bitfile-set, not to the clone.
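The read sequence above, including the permanent hole case, can be modeled in a few lines. The types and routine names here (clone_xtnt_t, orig_t, clone_t, clone_read(), orig_fill_hole()) are hypothetical and stand in for the real extent map structures.

```c
#include <stddef.h>

#define NPAGES 4

typedef enum {
    XT_NONE,        /* clone has no extent here: consult the original */
    XT_DATA,        /* clone holds its own copy of the page */
    XT_PERM_HOLE    /* page was a hole at clone time and must stay one */
} clone_xtnt_t;

typedef struct {
    const char *page[NPAGES];     /* NULL means a hole */
} orig_t;

typedef struct {
    clone_xtnt_t state[NPAGES];
    const char  *page[NPAGES];
    int          npages;          /* page range of the clone, fixed at clone time */
} clone_t;

/* Return the page as seen through the clone; NULL for a hole or out of range. */
const char *clone_read(const clone_t *cl, const orig_t *orig, int pg)
{
    if (pg < 0 || pg >= cl->npages)
        return NULL;                            /* step 2a: not in clone's range */
    switch (cl->state[pg]) {
    case XT_DATA:      return cl->page[pg];     /* step 1 */
    case XT_PERM_HOLE: return NULL;             /* step 3 */
    default:           return orig->page[pg];   /* step 2b */
    }
}

/* Write into a page of the original; if it was a hole, pin the hole in the clone. */
void orig_fill_hole(orig_t *orig, clone_t *cl, int pg, const char *data)
{
    if (pg < cl->npages && orig->page[pg] == NULL && cl->state[pg] == XT_NONE)
        cl->state[pg] = XT_PERM_HOLE;
    orig->page[pg] = data;
}
```

Without the permanent hole marker, a read through the clone would fall through to the original and return data written after the snapshot was taken.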
Deleting Bitfile from Cloned Original
The following sequence is needed to delete a file from the original fileset while the clone exists:
1. Ensure data is available for clone after deletion from original fileset.
2. Original fileset is marked delete with clone.
It exists until clone fileset is deleted.
This is not the same as unlinking a file from fileset.
FAS-level understands multiple links for one file.
Source location: msfs/bs/bs_delete.c
Deleting a Bitfile
The sequence for deleting a file is:
1. Set bitfile attributes state to BSRA_DELETING.
2. Delete the bitfile from the tagfile.
3. Add bitfile to deferred-delete list (DDL) for disk.
If system crashes, on recovery DDL is processed.
4. Wait for bitfile to close to reap the storage.
There are some variations of this process.
Source location: msfs/bs/bs_delete.c
Closing a Deleted Bitfile
The final steps of the deletion happen when the file is closed:
1. Carefully delete the storage.
2. Perform a series of root transactions.
a. Pin several pages of SBM.
b. Update the storage bitmap to delete extents.
c. Update the delRst field of bitfile’s extent map to point to next extent to delete.
The disk storage delete code is found in del_dealloc_stg() and del_xtnt_array() in msfs/bs/bs_delete.c.
The mcell chain delete code is found in bmt_free_bf_mcells() in msfs/bs/bs_bmt.util.c.
Carefully delete the bitfile’s mcell chain.
3. Perform a series of continued transactions.
a. Pin several pages of BMT.
b. Free the mcells on those pages.
c. Start a continuation transaction which knows next mcell to delete.
Migrating Files and Deleting Filesets
Overview
This section describes the sequence of routines used to accomplish file migration and fileset deletion. File migration takes place when the migrate command or the defragment command is used.
• Migrating a bitfile
• Deleting a fileset
Migrating a Bitfile
Use the following sequence to migrate a file:
1. Allocate new target storage.
2. Place target on deferred delete list.
If system crashes, it is gone on recovery.
3. Put target storage on copy extent map list.
Modifications will go to both source and target.
4. Copy blocks, source to target.
5. Flush blocks.
6. Switch roles on target and source.
Source will be reclaimed.
Source location: msfs/bs/bs_migrate.c
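The effect of the copy extent map list (step 3) can be shown with a toy model in which a write arriving during the migration lands on both source and target. The mig_file_t type and the mig_*() routines are hypothetical; the deferred delete list and the flush step are left out of this sketch.

```c
#define NBLKS 4

/* Hypothetical model of a file that is being migrated. */
typedef struct {
    char  src[NBLKS];   /* current (source) storage */
    char  dst[NBLKS];   /* newly allocated target storage */
    char *active;       /* storage that the file currently uses */
    int   copying;      /* nonzero while the target is on the copy list */
} mig_file_t;

/* Steps 1-3: target allocated and placed on the copy list. */
void mig_begin(mig_file_t *f)
{
    f->active = f->src;
    f->copying = 1;
}

/* A write during migration goes to both source and target. */
void mig_write(mig_file_t *f, int blk, char v)
{
    f->active[blk] = v;
    if (f->copying)
        f->dst[blk] = v;
}

/* Step 4: copy one block, source to target. */
void mig_copy_block(mig_file_t *f, int blk)
{
    f->dst[blk] = f->src[blk];
}

/* Step 6: switch roles; the old source can now be reclaimed. */
void mig_finish(mig_file_t *f)
{
    f->active = f->dst;
    f->copying = 0;
}
```

Because writes are mirrored while the copy is in progress, the target is complete at the switch regardless of whether a block was written before or after it was copied.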
Deleting a Fileset
Use the following sequence to delete a fileset:
1. Add bitfile-set to domain’s delete pending list.
2. Iterate through the tags of the bitfile-set.
3. Delete each bitfile.
4. Remove bitfile-set from bitfile-set delete pending list.
5. Delete tagfile.
Source location: bs_bfs_delete() is in msfs/bs/bs_bitfile_sets.c
Documenting Threads
Overview
This section documents several kernel threads which are active behind the scenes to keep AdvFS in sync. Kernel threads are found under PID 0.
• AdvFS threads
• Fragment bitfile thread
• I/O thread
• AdvFS cleanup thread
AdvFS Threads
The following are some common characteristics of the AdvFS threads:
• Are created by kernel idle thread routine (PID 0)
• Receive typed messages on queue
• Block with cond_wait()
Source location: msfs/bs/bs_msg_queue.c
Fragment Bitfile Thread
Fragment groups are trimmed periodically by the fragment bitfile thread (one per system):
• Deallocates frag groups of type 0 when there are too many
Target is AdvfsMinFragGrps (default is 16)
• Is awakened from frag_group_dalloc() with message containing bitfile-set ID
Kernel thread routine is bs_fragbf_thread() in msfs/bs/bs_bitfile_sets.c.
Default value for free fragment groups is 16. The value can be changed with sysconfigtab.
I/O Thread
This thread monitors its message queue for requests to trigger I/O:
• For START_MORE_IO messages:
— Calls bs_startio() for a virtual disk
— Is awakened by bs_osf_complete() when queue is small
• For LF_PB_CONT messages:
— Checks if a log flush continue or a pin block continue is needed
— Is awakened by bs_io_complete() if HiFlushLSN has changed
Source locations:
• msfs/bs/bs_qio.c for bs_io_thread()
• msfs/osf/msfs_io.c for bs_osf_complete()
AdvFS Cleanup Thread
Various system routines communicate with this kernel thread using its message queue:
• For FINSH_DIR_TRUNC messages:
— Truncates space from directory
— Is awakened by routines to insert directory entries
• For CLEANUP_CLOSED_LIST messages:
— Moves bfAccess structures from closed to free list
— Is awakened by routines which allocate bfAccess structures
Source location: msfs/fs/fs_dir_init.c
Summary
Describing Entries to AdvFS
The VFS switch table defines a series of pointers to functions implementing many of the file system level activities.
The vnode switch table defines a series of pointers to functions implementing most of the file-oriented AdvFS activities.
Data read from an AdvFS file system ultimately ends up in a memory location that is associated with the unified buffer cache.
AdvFS I/O must eventually cause device activity to begin. The device activity is triggered by device driver routines.
Starting Up and Recovering in AdvFS
Use bs_bfset_activate_int() to activate or find a domain structure.
Use bs_bfdmn_tbl_activate() to search for virtual disks.
Use bs_bfdmn_activate() to activate the domain.
Use ftx_bfdmn_recovery() to recover a domain with one of three recovery passes.
Providing Storage Management
Some SBM information is cached in memory data structures:
• Disk free storage list
• BAS-level routines add storage
One concern of FAS-level storage is page-write clustering. If the file is being written sequentially, data is preallocated in page sizes of:
• MIN (pg_to_write/4, MAX_PREALLOC_PAGES)
• pg_to_write is present page number
• MAX_PREALLOC_PAGES is presently 16
If this fails, data is allocated as needed. BAS-level will combine adjacent allocations.
Cloning
Cloning a fileset creates a read-only, pseudo copy (snapshot) of a file system, usually for the purpose of online backups.
The fs_fset_clone() routine performs various access checks. The bs_bfs_clone() routine:
• Creates new bitfile-set.
• Copies original’s tagfile to clone’s tagfile.
• Makes appropriate modifications to bitfile-set attributes record.
Files open when cloning may not have perfect snapshots.
Migrating Files and Deleting Filesets
File migration takes place when the migrate command or the defragment command is used. To migrate a file:
1. Allocate new target storage.
2. Place target on deferred delete list.
3. Put target storage on copy extent map list.
4. Copy blocks, source to target.
5. Flush blocks.
6. Switch roles on target and source.
To delete a fileset:
1. Add bitfile-set to domain’s delete pending list.
2. Iterate through the tags of the bitfile-set.
3. Delete each bitfile.
4. Remove bitfile-set from bitfile-set delete pending list.
5. Delete tagfile.
Documenting Threads
AdvFS threads are created by the kernel thread routine.
• Fragment bitfile thread: One per system; deallocates frag groups of type 0
• I/O thread: For START_MORE_IO and LF_PB_CONT messages
• FS cleanup thread: For FINSH_DIR_TRUNC and CLEANUP_CLOSED_LIST messages
Exercises
Labs for this section involve reading any of the source code specified in the student materials (if it is available). The instructor will suggest several routines.
Solutions
If you are not confident using the C programming language, this code reading can be done as a group. Routine fs_fset_clone() found in fs_file_sets.c may be a good starting point.
5
Troubleshooting AdvFS
About This Chapter
Introduction
This chapter presents some AdvFS tips and hints on troubleshooting. It also provides a discussion of some known AdvFS issues/problems as well as a number of actual troubleshooting case studies.
Objectives
To troubleshoot AdvFS you should be able to:
• Identify commands, tools, and practices to isolate the problem.
• Examine a sample problem and identify possible solutions.
Resources
For more information on topics in this chapter as well as related topics, see the following:
• Advanced File System Administration
• Tru64 UNIX System Configuration and Tuning
• Tru64 UNIX AdvFS Reference pages
Case Study Format
The case studies found in this chapter are presented using the following format:
• Problem statement
• Configuration
• Problem description
• Analysis
• Things attempted
• Final solution/summary
Describing AdvFS Troubleshooting Practices
AdvFS Commands and Utilities
This table provides a summary of AdvFS troubleshooting-related commands. See the AdvFS commands in the Appendix book for more detailed information on these commands.
Table 5-1: AdvFS Commands and Utilities
Command              Function
addvol               Adds a volume to an existing file domain.
advfsstat            Displays performance statistics.
advscan              Locates AdvFS volumes (disk partitions or LSM disk groups) that are in AdvFS domains.
balance              Balances the percentage of used space among volumes in a domain.
chfile               Changes attributes of an AdvFS file.
chfsets              Changes fileset quotas (file usage limits and block usage limits).
chvol                Changes attributes of a volume in an active domain.
defragment           Makes the files on a disk more contiguous.
migrate              Moves a file or file pages to another volume in an AdvFS domain.
mkfdmn               Creates a new AdvFS file domain.
mkfset               Creates an AdvFS fileset.
mountlist            Checks for mounted AdvFS filesets.
ncheck               Lists i-number or tag and path for all files in a file system.
nvbmtpg              Displays pages of AdvFS bitfile metadata table (BMT) file; new command in Tru64 UNIX V5.0.
nvfragpg             Displays the pages of an AdvFS fragment file; new command in Tru64 UNIX V5.0.
nvlogpg              Displays the log file of an AdvFS file domain; new command in Tru64 UNIX V5.0.
nvsbmpg              Displays a page of the storage bitmap (SBM) file; new command in Tru64 UNIX V5.0.
nvtagpg              Displays a page formatted as a tag file page; new command in Tru64 UNIX V5.0.
rmfdmn               Removes a file domain.
rmfset               Removes a fileset or a clone fileset from an AdvFS file domain.
rmvol                Removes a volume from an existing file domain.
salvage              Recovers file data from damaged AdvFS file domains; new command in Tru64 UNIX Version 5.0 release. Versions are available for previous releases of Tru64 UNIX.
shblk                Displays unformatted disk blocks.
shfragbf             Displays how much space is used on the fragment file.
showfdmn             Displays the attributes of a file domain and detailed information about each volume in the file domain.
showfile             Displays the attributes of AdvFS directories and files.
showfsets            Displays information about filesets in an AdvFS domain.
stripe               Stripes a file across several volumes in a file domain.
switchlog            Moves an AdvFS file domain transaction log.
tag2name             Displays the path name of a file given the tag number.
vbmtchain            Displays metadata for a file including the time-stamp, extent map, and whether the file is a user directory or data file.
vbmtpg               Displays a complete, formatted page of the BMT for a mounted or unmounted domain.
vdf                  Displays disk information for AdvFS domains and filesets; new command in the Tru64 UNIX Version 5.0 release.
vdump, rvdump        Performs full and incremental backups on filesets.
verify               Checks on-disk structures such as the BMT, storage bitmaps, tag directory, and the fragment file for each fileset; included in Tru64 UNIX Version 4.0 and higher.
vfile                Displays the contents of a file from an unmounted domain.
vfilepg              Displays pages of an AdvFS file.
vfragpg              Displays a single header page of a fragment file.
vlogpg               Translates a 16-block part of a volume of an unmounted file system and formats it as a log page.
vlsnpg               Displays the logical sequence number (LSN) of a page of the log.
vrestore, rvrestore  Restores files from savesets produced by vdump and rvdump.
vsbmpg               Displays a page from a storage bitmap (SBM) file.
vtagpg               Displays a formatted page of a file.
Troubleshooting Tips and Practices
Here are some general troubleshooting practices and procedures you should consider when investigating an AdvFS problem.
Describe the problem (and any relevant circumstances) in as much detail as possible. Include in this description the answers to these types of questions:
• How often has the problem occurred?
• Is the problem reproducible?
• When was the last time this feature worked properly?
• What steps led to the problem?
• Is there anything else that is not behaving normally (whether related or not)?
• What (if any) parts of the system that might be relevant are working as expected?
• Was anything happening physically close to the system at the time the problem appeared? For example, was a cable unintentionally disconnected?
• Has anything changed either in the hardware or software configuration? For example, changes in hardware, software/firmware updates, patches, or installations?
• If any changes have taken place, can the configuration be restored to its original state to determine whether the problem is still present? (Note: changes should be made sparingly and in as logical a sequence as possible to reduce the number of possible factors influencing or potentially masking the problem.)
Check for hardware-related causes of the problem.
• Are there any obvious problems with cables, connectors, or terminators? For example, are the cables too long? Are there any bent pins? Is the hardware seated properly?
• Check hardware and firmware revision levels (sys_check mentioned below will look for these).
• Does the problem move with the hardware?
Check these locations for any error messages:
• binary.errlog
• /var/adm/syslog.dated/<date>/kern.log
• /var/adm/syslog.dated/<date>/daemon.log
• /var/adm/messages
The /var/adm/syslog.dated/<date>/kern.log and the /var/adm/messages files are written to by the syslogd daemon. Many errors encountered on the system that are not hardware related will be written to the syslog facility and, depending upon the configuration of the /etc/syslog.conf file, will go to one or all of these logs. In addition to the kern.log, the daemon.log is another log used heavily by specialists troubleshooting ASE problems. kern.log and messages will normally contain more messages related to AdvFS-specific issues (for example, AdvFS I/O errors).
• Check the advfs_err(4) reference page to find a brief description based on an error number.
• Search CANASTA if a panic is involved.
CANASTA is a Compaq internal crash dump analysis tool being used world-wide inside Compaq to store and evaluate crash footprint information for OpenVMS Alpha, OpenVMS VAX and Tru64 UNIX system crashes.
CANASTA uses AI technology to provide solutions or additional troubleshooting information for system crash problems. The CANASTA tool is typically used in the Customer Service Center (CSC), but access to the CANASTA knowledge database is also available using the CANASTA Mail Server, TIMA STARS and COMET.
By using the AutoCLUE tool, customer crash dump information can be automatically sent to Compaq using DSNlink and will be analyzed using the DSNlink CLUE post-processor. Solution information, if available, can be automatically returned to the customer and/or included in the call handling system.
• If you think it might be a bug in the software, research the reported bugs and patches for potential similarities.
Use the following sources to look for any existing information:
— The Atlanta CSC UNIX Web page has links to many useful sources.
— COMET search for past cases: Integrated Problem Management Tool (IPMT)/QAR entries, blitzes and notes conferences.
— COMET is an intelligent web-based storage and retrieval tool which allows users to search large collections of documents. It differs from STARS in its search algorithms and available information. A prime feature, Smart Query, decides which subset of COMET's 500 databases is most likely to contain the information you want. An account is not required to access COMET.
— Search of blitzes database:
— Notes conferences:
— Patch READMEs
— AdvFS/LSM Focals
— AdvFS/LSM Manuals
— SPD
• Use system tools to check for problems.
— sys_check
sys_check is a useful ksh script that can help to debug or diagnose system problems. The script generates an HTML file of a Tru64 UNIX configuration. This script has been tested on DIGITAL UNIX Version 3.2*, and Version 4.0 systems.
— iostat
— Performance Manager
• Use AdvFS tools and utilities to check for and fix problems.
Troubleshooting File System Corruption
Overview
This section presents some general information regarding troubleshooting AdvFS file system corruption problems.
Generally, fixing an AdvFS corruption problem depends on what caused the corruption. It is important to analyze an AdvFS corruption problem to determine the root cause.
Recognizing File System Corruption
Customers rarely have a problem recognizing and reporting file system corruption. Some symptoms of file system corruption a customer might report include:
• System panic
• Domain panic
• Corrupted data
• Unexpected behavior after entering ordinary commands on files in an AdvFS file system
Causes of AdvFS Corruption
AdvFS corruption is usually caused by one of the following:
• Hardware problem
Hardware problems are the most common sources of AdvFS-related system panics. The most frequent cause of corruption in any file system is bad blocks on the physical disk. Another common cause is outdated firmware revisions.
• Uncontrolled system shutdown
AdvFS is generally robust enough to withstand unexpected system crashes or power outages, but such events may still cause corruption in certain cases.
• Software bugs in the AdvFS software
Software bugs can often be reproduced. AdvFS software bugs are usually fixed by patches. Any available, relevant patches should be applied in the initial stages of troubleshooting a problem. Available resources should be checked for relevant patches since it is not always obvious which patches might be relevant to AdvFS.
No Valid File System Error Message
Possible causes of this symptom include:
• Corrupted metadata, BMT, or transaction log possibly caused by a disk, controller or other hardware problem
• Software bug
Possible troubleshooting actions include:
• If there was a panic, search CANASTA.
• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before taking any other action.
• Run sys_check.
• Repair and/or restore from backup.
• Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. (Use the msfsck and the vchkdir command for DIGITAL UNIX Version 3.x systems.)
— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0. (A field test version of salvage is available in DIGITAL UNIX Version 4.0D.)
• If applicable, ensure that the Logical Storage Manager (LSM) is started by checking for the vold daemon. If necessary, start LSM.
• Check the links in /etc/fdmns/domainname directory for correctness.
• File an IPMT.
Mount File System Operation Crashes the System
Possible causes of this symptom include:
• Corrupted metadata, BMT, or transaction log. Possibly caused by disk, controller, or other hardware problem
• Software bug
Possible troubleshooting actions include:
• Analyze the system crash dump for insight towards determining the next troubleshooting step. Search CANASTA.
• Run sys_check.
• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before taking any other action.
• Attempt to execute the mount -d or mount -r commands to mount the file system.
This technique has been useful because it allows you to get a backup of the file system. The specialist should use caution when using either flag. The -d flag disables transaction logging. The -r flag mounts the file system with read-only access.
• Repair and/or restore from backup media.
— Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)
— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.
— File an IPMT.
Localized Corruption
Localized corruption is often a situation that is tolerable. The verify utility may be useful in cases of local corruption.
Symptoms of localized corruption include:
• Normal file manipulations do not work properly for a few files on the file system.
• Customer notices AdvFS I/O errors in the messages or kern.log files.
• No problems mounting file system.
Possible causes include:
• Corrupted directory or files possibly caused by bad blocks
• CPU exceptions or other hardware problems.
• Uncontrolled system shutdown such as power failure or crash.
• Software bug.
Possible troubleshooting actions include:
• Check the binary errorlog for bad block replacements or other hardware events.
• If excessive, ensure the hardware problem is resolved before taking any other action.
• If the corruption is not increasing and remains localized, add a new volume to replace the volume experiencing the errors.
• Repair and/or restore from backup media.
— Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)
— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
— Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.
• File an IPMT.
Generalized Corruption
Generalized corruption is often a more serious situation than localized corruption. The verify utility is not generally useful in fixing generalized corruption. Time spent using verify on generalized corruption problems may be better spent running salvage or restoring from backup.
Symptoms of generalized corruption include:
• Normal file manipulations do not work properly for many or all of the files on a file system.
• Numerous AdvFS I/O errors in the messages or kern.log files.
• No problem mounting the filesets in the domain.
Possible causes include:
• Corrupted metadata or fragment list possibly caused by bad blocks, CPU exceptions or other hardware problems.
• Uncontrolled system shutdown such as power failure or crash.
• Software bug.
Possible troubleshooting actions include:
• Check the binary errorlog for bad block replacements or other hardware events. If excessive, ensure the hardware problem is resolved before taking any other action.
• You can try adding volumes and removing the volumes having problems. In the case of general corruption, this will probably not solve the problem. This process is time consuming with a large number of bad files.
Repair and/or restore from backup media.
• Repair domain structures using the verify utility. However, since verify will attempt to mount the filesets, a system panic will most likely occur. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)
• If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
• Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.
• File an IPMT.
Domain Panic
A domain panic has occurred when the domain goes offline and data is inaccessible to users of the system.
Possible causes include:
• Corrupted metadata, BMT, fragment list or transaction log possibly caused by bad blocks, CPU exceptions or other hardware problems.
• The filesets in the domain will not mount.
• Software bug.
Possible troubleshooting actions:
• Check the binary errorlog for bad block replacements or other I/O errors. If excessive, ensure the hardware problem is resolved before any other action.
• Use the mount -d command to try to get data off when restoring any volumes that have I/O errors.
Repair and/or restore from backup media.
• Repair domain structures using the verify utility for DIGITAL UNIX Version 4.x systems. However, since verify will attempt to mount the filesets, a system panic will most likely occur and verify will be unsuccessful in fixing the problem. (Use the msfsck and the vchkdir commands for DIGITAL UNIX Version 3.x systems.)
• If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
• Only if both methods are unsatisfactory should you employ the salvage utility. salvage is a new utility available in Tru64 UNIX Version 5.0.
• File an IPMT.
Resolving Known AdvFS Issues
Overview
This section highlights a few known AdvFS issues or problems and strategies to resolve or work around these issues.
Log Half-Full Problem
Under some circumstances, in pre-version 5.0 releases of Tru64 UNIX, AdvFS can panic with the log half full error message.
The following situations are known to have a causal relationship with an AdvFS panic with the log half full error message:
• When a very large file truncate is performed (this can occur when a file is overwritten by another file or by an explicit truncate system call), and the fileset containing the file has a clone fileset.
• When very large, highly fragmented files are migrated. Files with greater than 40000 extents are at risk. A migrate operation is performed when running the defragment, balance, rmvol, or migrate AdvFS utilities.
Regardless of the cause, the problem can be addressed by either reducing file fragmentation or by increasing the size of the log.
Fixing Log Half-Full Problems: Reducing Fragmentation
File fragmentation can be reduced by:
1. Performing a backup of the files, deleting them, and restoring them.
2. Running the defragment utility on the files.
Determining Appropriate Log Size
Use these guidelines to determine an appropriate log size.
Enter this showfile command to determine the number of extents:
showfile -x filename | grep extentCnt
Table 5-2: Log Size Guidelines
Number of Extents    Recommended Log Size (pages)
40000                768
60000                1024
80000                1280
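The guideline in Table 5-2 can be turned into a small lookup. The routine name recommended_log_pages() is hypothetical, and how to treat counts that fall between or outside the table rows is an assumption of this sketch: counts below the first row keep the 512-page default, counts between rows round up to the next row, and counts beyond the last row get the largest listed size because the table gives nothing larger.

```c
#include <stddef.h>

#define DEFAULT_LOG_PAGES 512U   /* default log size in pages */

/* Rows of Table 5-2: extent count and recommended log size in pages. */
static const struct {
    unsigned long extents;
    unsigned      pages;
} log_guide[] = {
    { 40000UL,  768U },
    { 60000UL, 1024U },
    { 80000UL, 1280U },
};

/* Suggest a log size for a file with extent_cnt extents (see assumptions above). */
unsigned recommended_log_pages(unsigned long extent_cnt)
{
    const size_t n = sizeof log_guide / sizeof log_guide[0];
    size_t i;

    if (extent_cnt < log_guide[0].extents)
        return DEFAULT_LOG_PAGES;         /* below the table: keep the default */
    for (i = 0; i < n; i++)
        if (extent_cnt <= log_guide[i].extents)
            return log_guide[i].pages;    /* round up to the next listed row */
    return log_guide[n - 1].pages;        /* beyond the table: largest listed size */
}
```

The extent count passed in would come from the extentCnt value reported by the showfile -x command shown earlier.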
Fixing Log Half Full Problems: Increasing Log Size Using switchlog
If you have a spare partition, you can:
1. Add a spare volume to the domain.
2. Move the log to that volume.
3. Move the log back with an increase in size.
4. Remove the spare volume.
If you have a spare partition, follow these steps to increase the log size:
1. Enter the addvol command specifying the block device special file name of the disk that you are adding to the file domain and the domain name.
# addvol /dev/rz10b domain          <== V4.x
# addvol /dev/disk/dsk10b domain    <== V5.x
2. Enter the showfdmn command specifying the domain name.
# showfdmn domain
The showfdmn command will display information similar to the following:
               Id              Date Created  LogPgs  Domain Name
31b8a083.00049136  Fri Jun  7 17:34:59 1996     512  small

  Vol   512-Blks        Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1      401408           0    100%     on    128    128  /dev/rz11b
   2L     262144         192    100%     on    128    128  /dev/rz3b
   3      393216           0      0%     on    128    128  /dev/rz10b
      ----------  ----------  ------
         1056768         192    100%
3. Enter the switchlog command specifying the name of the domain and the number of the new volume to use for the log.
# switchlog domain 3
4. Enter the switchlog command specifying a larger log size with the -l option and the number of the volume to use for the log. (The -l option is undocumented.) This command essentially moves the log back with a larger log size.
# switchlog -l 1024 domain 2
5. Enter the following rmvol command to remove the spare volume.
# rmvol /dev/rz10b domain
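The five steps above can be collected into a dry-run sketch that only prints the command sequence for review and executes nothing. The function name and argument order are hypothetical; the commands and the -l option are the ones used in the steps above.

```shell
#!/bin/sh
# print_switchlog_plan DOMAIN SPARE_DEV SPARE_VOL HOME_VOL LOG_PAGES
# Hypothetical dry-run helper: echoes the commands, does not run them.
#   SPARE_VOL is the volume number that showfdmn reports for SPARE_DEV.
#   HOME_VOL  is the volume number that held the log originally.
print_switchlog_plan() {
    domain=$1
    spare_dev=$2
    spare_vol=$3
    home_vol=$4
    pages=$5
    echo "addvol $spare_dev $domain"                   # step 1: add spare volume
    echo "showfdmn $domain"                            # step 2: find volume numbers
    echo "switchlog $domain $spare_vol"                # step 3: move log to spare
    echo "switchlog -l $pages $domain $home_vol"       # step 4: move back, larger
    echo "rmvol $spare_dev $domain"                    # step 5: remove spare volume
}

print_switchlog_plan domain /dev/rz10b 3 2 1024
```

Printing the plan first makes it easy to check the volume numbers against the showfdmn output before anything is run against a live domain.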
Fixing Log Half-Full Problems: Increasing Log Size Using mkfdmn
Alternatively, you can set the log size, in pages, when you create the domain with the mkfdmn command as follows:
mkfdmn -l pages
The default number of pages for the log is 512.
The -l option is an enhancement to the AdvFS mkfdmn command included in DIGITAL UNIX Version 4.0d. If you use the mkfdmn command, the domain is reinitialized and must be restored from backups. For an existing domain, the log half full problem should therefore be solved by using the switchlog command.
BMT Exhaustion
In DIGITAL UNIX Version 4.0c and earlier only, AdvFS file systems that:
• Consist primarily of small files (less than 8KB) and
• Regularly create and delete very large numbers (many hundreds) of these small files
can run out of metadata space (inode tables), causing misleading out of disk space errors. Internet news servers and mail servers are particularly prone to this problem.
In DIGITAL UNIX Version 4.0d, use space reservation to work around the BMT exhaustion problem. This problem is fixed in Tru64 UNIX Version 5.0.
Avoiding BMT Exhaustion
BMT exhaustion is primarily a problem in DIGITAL UNIX Version 4.0C and earlier. The problem can still occur in DIGITAL UNIX Version 4.0D, but it is less likely there because of space reservation. Preallocating the metadata immediately after the file domain is created and/or additional volumes are added avoids BMT exhaustion.
For DIGITAL UNIX Version 3.x, this preallocation can be accomplished by writing a script that creates and then deletes the estimated number of files that are expected to exist in the AdvFS domain. The files created by the script are empty.
For example, the following ksh script preallocates metadata for 1000 files:
integer f=1000
while ((f > 0))
do
    touch prealloc_$f
    f=f-1
done
rm prealloc_*
For DIGITAL UNIX Version 4.x, this preallocation can be accomplished by using the -x and -p switches to the mkfdmn and addvol commands. These switches set the number of 8K file system pages by which the bitfile metadata table (BMT) is extended (-x) or preallocated (-p).
For example, the following steps show how the -x switch can create a new AdvFS file system containing two volumes in which both will extend their BMT by 2048 pages at a time.
1. Enter the mkfdmn command. Specifying the -o option overwrites an existing file domain, allowing you to recreate the domain structure. Specifying the -x option lets you set the number of pages by which the bitfile metadata table extent size grows. The default is 128 pages.
# mkfdmn -o -x 2048 /dev/vol/vol08 test_dmn
2. Enter the mkfset command to create a new fileset in the specified domain.
# mkfset test_dmn test
3. Enter the addvol command to add a volume to the specified domain. Using the -x option, you can set the number of pages (extent size) by which the bitfile metadata table grows. The default is 128 pages.
# addvol -x 2048 /dev/vol/vol09 test_dmn
For example, the following steps show how to use the -p switch to create a new AdvFS file system containing one volume in which the volume’s BMT is preallocated by 10240 pages.
1. Enter the mkfdmn command. Specifying the -p option lets you set the number of pages by which the bitfile metadata table is preallocated. There is no default.
# mkfdmn -p 10240 /dev/vol/vol06 test_domain
2. Enter the mkfset command to create a new fileset in the specified domain.
# mkfset test_domain test
For example, the following steps show how the -x and -p switches can be used together to create a new AdvFS file system containing one volume in which the volume’s BMT is preallocated by 4096 pages and will extend by 1024 pages.
1. Enter the mkfdmn command as follows:
# mkfdmn -p 4096 -x 1024 /dev/vol/vol06 test_domain
2. Enter the mkfset command to create a new fileset in the specified domain.
# mkfset test_domain test
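Because the -x and -p values are counted in 8K pages, it can help to see how much disk space a given value commits to the BMT. The following sketch does that arithmetic; the function names are hypothetical, and the only fact taken from the text is the 8K page size.

```shell
#!/bin/sh
# Hypothetical helpers: convert a BMT page count (-x or -p value) to disk space.
BMT_PAGE_KB=8    # page size stated in the text

bmt_pages_to_kb() {
    echo $(( $1 * BMT_PAGE_KB ))
}

bmt_pages_to_mb() {
    echo $(( $1 * BMT_PAGE_KB / 1024 ))
}

# The values used in the examples above:
echo "-p 10240 preallocates $(bmt_pages_to_mb 10240) MB"
echo "-p 4096  preallocates $(bmt_pages_to_mb 4096) MB"
echo "-x 2048  extends by   $(bmt_pages_to_mb 2048) MB at a time"
```

With these figures, the -p 10240 example reserves 80 MB of the volume for metadata before any file is created.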
BMT Extent Map Allocations
Table 5-3 gives a general idea of what the BMT extent map should look like for different values of the -x and -p switches given to mkfdmn or addvol, assuming the domain does not become fragmented before filling out the BMT extent map.
If the domain becomes fragmented before filling out the BMT extent map, the size of the extents (at some point between three and 683 extents) will diminish and be smaller than the default (or the value specified using the -x switch). It is not possible to predict those sizes because the file system will attempt to find the largest available hole to hold the extent. The size of this hole depends on the file system fragmentation at the time the system attempts to find the extent.
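The pattern in Table 5-3 can be written down as a small model. This is a sketch of the unfragmented case only, with hypothetical function names: extent 1 is always 2 pages, extent 2 is the -p value when one is given (otherwise the -x value), and extents 3 through 683 are each the -x value. Passing 0 for the -p argument means that -p was not specified.

```shell
#!/bin/sh
# bmt_extent_pages INDEX X_PAGES P_PAGES
# Size in pages of BMT extent INDEX (1..683), following Table 5-3.
bmt_extent_pages() {
    idx=$1
    x=$2
    p=$3
    if [ "$idx" -eq 1 ]; then
        echo 2
    elif [ "$idx" -eq 2 ] && [ "$p" -gt 0 ]; then
        echo "$p"
    else
        echo "$x"
    fi
}

# bmt_max_pages X_PAGES P_PAGES
# Total BMT pages once all 683 extents are in use (unfragmented case).
bmt_max_pages() {
    x=$1
    p=$2
    if [ "$p" -gt 0 ]; then
        second=$p
    else
        second=$x
    fi
    echo $(( 2 + second + 681 * x ))
}

echo "default:            $(bmt_max_pages 128 0) pages"
echo "-x 2048 -p 30000:   $(bmt_max_pages 2048 30000) pages"
```

The totals show why a larger -x value matters: the number of extents is bounded, so the extent size determines how large the BMT can grow.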
BMT Exhaustion: Fixing the Problem
Two common command sequences for fixing a BMT exhaustion problem are:
1. Backup, mkfdmn/mkfset, restore from backup
This method generally restores files and metadata in the domain contiguously. In addition, you can preallocate metadata when making the file domain or adding volumes so that you can be sure that the BMT is contiguous. This strategy requires some down time for that domain while the operation is in progress. For a large domain, this could be a long period of time.
You could use the DIGITAL UNIX Version 3.x method of preallocating files before restoring. In DIGITAL UNIX Version 4.x, you could use the -x and -p flags.
2. addvol, rmvol
The addvol command adds another new volume to the domain with a fresh BMT that files will be migrated to.
Table 5-3: BMT Extent Map Allocations
Extent   Default   -x 1024   -p 20000   -x 2048 -p 30000
1        2         2         2          2
2        128       1024      20000      30000
3        128       1024      128        2048
4        128       1024      128        2048
...      ...       ...       ...        ...
683      128       1024      128        2048
The addvol command also allows you to use the -x and -p options to alter the defaults for BMT. Using the -x flag, you can set the number of pages by which the BMT extent size grows. Using the -p flag, you can set the number of pages to preallocate for the BMT.
The rmvol command automatically migrates files and metadata to other volumes in the domain.
The result is similar to restoring from backups in that the metadata is written in a contiguous fashion, and is therefore defragmented. Defragmented metadata generally will not become exhausted as quickly.
These commands can be executed while the domain is online, minimizing the impact on the users of the domain.
Additional disk space is needed for the new volume. In an existing multivolume domain, you must add a volume equivalent in size to the largest volume in that domain.
Case Study 1: RBMT Corruption
Overview
This case study describes an RBMT corruption problem.
Problem Statement: Case Study 1
The problem statement received from the customer was:
"A multivolume AdvFS domain will not mount."
Configuration: Case Study 1
The configuration on which the customer experienced the problem consisted of a standalone DECstation 255 running Tru64 UNIX V5.0. The disks were both internal to the machine and in a BA353 storage unit.
Problem Description: Case Study 1
The customer reported being unable to mount file systems within a particular domain (bruden_dom). Other domains were functioning normally. No filesets within the afflicted domain would mount. One of the file systems had an existing clone.
Domain panic messages and mail were repeating, as were console messages indicating domain panics.
Analysis
Here is the analysis of the problem.
1. Since other domains were functioning normally, the specialist deduced that the problem was localized to the bruden_dom volumes. The error log showed no recent bad block replacements.
# mount bruden_dom#bruce_fset /usr/bruce
bruden_dom#bruce_fset on /usr/bruce: I/O error
# mount -d bruden_dom#bruce_fset /usr/bruce
bruden_dom#bruce_fset on /usr/bruce: I/O error
2. Attempt to gather information about the domain.
# ls -l /etc/fdmns/bruden_dom
total 0
lrwxr-xr-x 1 root system 15 Sep 28 16:59 dsk0a -> /dev/disk/dsk0a
lrwxr-xr-x 1 root system 15 Sep 28 17:01 dsk0b -> /dev/disk/dsk0b
lrwxr-xr-x 1 root system 15 Sep 28 17:08 dsk2h -> /dev/disk/dsk2h
# showfsets bruden_dom
bruce_fset
    Id           : 37f12c39.000263ea.1.8001
    Files        : 6, SLim= 0, HLim= 0
    Blocks (512) : 68288, SLim= 50000, HLim= 200000 grc= none
    Quota Status : user=off group=off

dennis_fset
    Id           : 37f12c39.000263ea.2.8001
    Clone is     : den_clone
    Files        : 324, SLim= 0, HLim= 0
    Blocks (512) : 93114, SLim= 0, HLim= 400000
    Quota Status : user=off group=off

den_clone
    Id           : 37f12c39.000263ea.3.8003
    Clone of     : dennis_fset
    Revision     : 3
3. Check for and evaluate any logged error messages.
4. Try verify to check for and correct any corruption.
# verify -f bruden_dom
verify: can’t get set info for domain ’bruden_dom’
verify: error = E_BAD_BMT (-1171)
verify: can’t allocate memory for fileset mount_point array
Unable to malloc an additional 0 bytes, currently using 0
exiting...
5. verify indicates a problem in the BMT. Let’s see if salvage can give us some help or insight. salvage (ultimately) causes further domain panics.
# salvage bruden_dom
salvage: Domain to be recovered ’bruden_dom’
salvage: Volumes to be used ’/dev/disk/dsk0a’ ’/dev/disk/dsk0b’ ’/dev/disk/dsk2h’
salvage: Files will be restored to ’.’
salvage: Logfile will be placed in ’./salvage.log’
salvage: Starting search of all filesets: 13-Oct-1999 10:38:26
salvage: Starting search of all volumes: 13-Oct-1999 10:38:26
salvage: Loading file names for all filesets: 13-Oct-1999 10:38:26
salvage: Starting recovery of all filesets: 13-Oct-1999 10:38:27
you have mail in /usr/spool/mail/root
# mail
From root Wed Oct 13 10:36:06 1999
Received: by den255 id KAA01133; Wed, 13 Oct 1999 10:36:05 -0400 (EDT)
Date: Wed, 13 Oct 1999 10:36:05 -0400 (EDT)
From: system PRIVILEGED account <root>
Message-Id: <199910131436.KAA01133@den255>
Subject: EVM ALERT [600]: AdvFS: An AdvFS domain panic has occurred on bruden_dom

============================ EVM Log event ===========================
EVM event name: sys.unix.fs.advfs.fdmn.panic

This event is posted by the AdvFS filesystem to indicate that an AdvFS domain panic has occurred on the specified domain. This is due to either a metadata write error or an internal inconsistency. The domain is being rendered inaccessible.

Action: Please refer to the guidelines in the AdvFS Guide to File System Administration for the steps to recover this domain.

======================================================================

Formatted Message:
    AdvFS: An AdvFS domain panic has occurred on bruden_dom

Event Data Items:
    Event Name      : sys.unix.fs.advfs.fdmn.panic
    Cluster Event   : True
    Priority        : 600
    PID             : 1131
    PPID            : 664
    Event Id        : 171
    Member Id       : 0
    Timestamp       : 13-Oct-1999 10:35:04
    Host IP address : 192.206.126.27
    Host Name       : den255
    Format          : AdvFS: An AdvFS domain panic has occurred on $domain
    Reference       : cat:evmexp.cat:450

Variable Items:
    domain = "bruden_dom"
6. The show commands confirm the BMT corruption.
# showfdmn bruden_dom
showfdmn: unable to get info for domain ’bruden_dom’
showfdmn: error = E_BAD_BMT (-1171)
# showfsets bruden_dom
showfsets: can’t show set info for domain ’bruden_dom’
showfsets: error = E_BAD_BMT (-1171)
7. Investigate potential known bugs and patches.
8. Elevate the problem to engineering. However, keep trying to solve the problem for the customer. See if the on-disk viewing tools can help.
# nvbmtpg -rR bruden_dom
Bad mcell record 0: bCnt too large 63265
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.
9. Since we can’t seem to get anywhere, let’s look at the beginning of the RBMT. Block 32 of the volume should contain the RBMT (see the on-disk module). The first mcell should map the RBMT itself.
# vfilepg /dev/disk/dsk0a -b 32
Bad mcell record 0: bCnt too large 63265
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.
==========================================================================
VOLUME "/dev/disk/dsk0a" (VDI 1) lbn 32
--------------------------------------------------------------------------
004000 08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20 ...............
004010 06 00 00 00 01 00 00 00 fa ff ff ff 00 00 00 00 ................
004020 fe ff ff ff 00 00 00 00 5c 00 02 00 03 00 00 00 ........\.......
004030 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004040 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................
004050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004080 00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00 ....P...........
004090 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 ................
0040a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040c0 01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
0040d0 ff ff ff ff 04 00 00 00 00 00 00 00 00 00 00 00 ................
0040e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004130 00 00 00 00 00 00 00 00 00 00 00 00 f9 ff ff ff ................
004140 00 00 00 00 fe ff ff ff 00 00 00 00 5c 00 02 00 ............\...
004150 03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 ................
004160 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 ................
004170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041a0 00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00 ........P.......
0041b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041c0 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041e0 00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00 ............p...
0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
10. The event manager log shows many AdvFS domain panic messages.
# evmget | evmshow
System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp
System shutdown msg: System halted by root:
System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp
System startup
ASCII msg: Test for EVM connection of binlogd
System timestamp
AdvFS domain panic
AdvFS domain panic
AdvFS domain panic
AdvFS domain panic
...
11. This page can be interpreted if we remember that each BMT (and RBMT) page starts with a 16-byte header followed by a series of mcells containing a variable number of records. Each mcell is 292 bytes and contains within it a 24-byte header. Try to determine what is wrong by looking at a good domain’s RBMT.
# showfdmn usr_domain
               Id              Date Created  LogPgs  Version  Domain Name
37da6652.0009f8d0  Sat Sep 11 10:25:22 1999     512        4  usr_domain

  Vol   512-Blks        Free  % Used  Cmode  Rblks  Wblks  Vol Name
   1L    1426112      201120     86%     on    256    256  /dev/disk/dsk1g
# vfilepg /dev/rdisk/dsk1g -b 32
==========================================================================
VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
--------------------------------------------------------------------------
004000 08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20 ...............
004010 06 00 00 00 01 00 00 00 fa ff ff ff 00 00 00 00 ................
004020 fe ff ff ff 00 00 00 00 5c 00 02 00 03 00 00 00 ........\.......
004030 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004040 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................
004050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004080 00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00 ....P...........
004090 00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00 ................
0040a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040c0 01 00 02 00 00 00 00 00 20 00 00 00 01 00 00 00 ........ .......
0040d0 ff ff ff ff 04 00 00 00 00 00 00 00 00 00 00 00 ................
0040e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004130 00 00 00 00 00 00 00 00 00 00 00 00 f9 ff ff ff ................
004140 00 00 00 00 fe ff ff ff 00 00 00 00 5c 00 02 00 ............\...
004150 03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 ................
004160 00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00 ................
004170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
004190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041a0 00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00 ........P.......
0041b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041c0 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0041e0 00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00 ............p...
0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
# nvbmtpg -rR usr_domain 1 0
==========================================================================
DOMAIN "usr_domain" VDI 1 (/dev/rdisk/dsk1g) lbn 32 RBMT page 0
--------------------------------------------------------------------------
CELL 0 next mcell volume page cell 1 0 6 bfSetTag,tag -2,-6(RBMT)

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 32 (0x20)
bsXA[ 1] bsPage 1 vdBlk -1

--------------------------------------------------------------------------
CELL 1 next mcell volume page cell 0 0 0 bfSetTag,tag -2,-7 (SBM)

RECORD 0 bCnt 92 BSR_ATTR
type BSRA_VALID

RECORD 1 bCnt 80 BSR_XTNTS
type BSXMT_APPEND chain mcell volume page cell 0 0 0
firstXtnt mcellCnt 1 xCnt 2
bsXA[ 0] bsPage 0 vdBlk 112 (0x70)
bsXA[ 1] bsPage 2 vdBlk -1
(...)
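The layout described in step 11 (16-byte page header, 292-byte mcells, 24-byte mcell header) is enough to locate the records in the hex dump. The sketch below does the offset arithmetic; the function names are hypothetical. The final 68-byte displacement of the first vdBlk field inside the BSR_XTNTS record is read off the dumps in this case study and is not taken from a structure definition.

```shell
#!/bin/sh
# Sizes given in step 11.
PAGE_HDR=16
MCELL_SIZE=292
MCELL_HDR=24

# mcell_offset CELL: byte offset of mcell CELL within the page
mcell_offset() {
    echo $(( $PAGE_HDR + $1 * $MCELL_SIZE ))
}

# first_record_offset CELL: offset of record 0 in mcell CELL
first_record_offset() {
    echo $(( $PAGE_HDR + $1 * $MCELL_SIZE + $MCELL_HDR ))
}

# next_record_offset OFFSET BCNT: a record's bCnt covers the whole record,
# so the next record starts bCnt bytes later
next_record_offset() {
    echo $(( $1 + $2 ))
}

r0=$(first_record_offset 0)          # BSR_ATTR, bCnt 92 (bytes 5c 00 in the dump)
r1=$(next_record_offset "$r0" 92)    # BSR_XTNTS, bCnt 80 (bytes 50 00 in the dump)
printf 'cell 0 record 0 at %d (0x%x)\n' "$r0" "$r0"
printf 'cell 0 record 1 at %d (0x%x)\n' "$r1" "$r1"
# Observed in the dump: the first vdBlk is 68 bytes into record 1.
printf 'cell 0 first vdBlk at %d (0x%x)\n' $(( r1 + 68 )) $(( r1 + 68 ))
```

Adding these page offsets to 004000, the dump address of block 32, gives the dump lines to inspect: record 0 of cell 0 at 004028, record 1 at 004084, and the first vdBlk at 0040c8.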
12. Determine the differences between block 32 of the corrupted volume and block 32 of the good volume.
# vfilepg /dev/rdisk/dsk1g -b 32 > /tmp/dsk1g
# vfilepg /dev/rdisk/dsk0a -b 32 > /tmp/dsk0a
get_stripe_parms: Bad mcell RBMT vol 1 page 0 cell 4: No BSR_XTNTS record in primary mcell.
# diff dsk0a dsk1g
1d0
< Bad mcell record 0: bCnt too large 63265
3c2
< VOLUME "/dev/rdisk/dsk0a" (VDI 1) lbn 32
---
> VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
17c16
< 0040c0 01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
---
> 0040c0 01 00 02 00 00 00 00 00 20 00 00 00 01 00 00 00 ........ .......
36c35
< 0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
---
> 0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
#
13. The byte containing hex 00 in line 0040c0 of the corrupted dump (the ninth byte, at offset 0040c8) should contain hex 20. The customer agreed to try a fix to the suspected corrupted byte. The specialist wrote a program to insert a hex 20 (decimal 32) in the raw volume file at the correct location. This turned out to be the LBN field of the RBMT’s extent field. It was supposed to contain a 32, indicating that block 32 is where to find the RBMT file. The corrupted RBMT had a 00 where it should have had a 20 (hex). The fix was tried and worked. Disk corruption was ultimately determined to be the problem.
# cat putbyte.c
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define READ_COUNT 512

int main(void)
{
    int fd, count = READ_COUNT, i = 0;
    int offset = 32*512;          /* block 32, in 512-byte blocks */
    long off = 0;
    char buf[READ_COUNT];         /* plain char: bytes >= 0x80 print sign-extended */
    ssize_t size, ret;

    fd = open("/dev/rdisk/dsk0a", O_RDWR);
    if (fd == -1) {
        perror("open problem");
        exit(EXIT_FAILURE);
    }

    off = lseek(fd, offset, SEEK_SET);
    if (off == -1) {
        perror("seek problem");
        exit(EXIT_FAILURE);
    }
    printf("offset (off) is %ld, %lx\n", off, off);

    size = read(fd, buf, count);
    if (size == -1) {
        perror("read trouble");
        exit(EXIT_FAILURE);
    }

    /* Dump the block, 16 bytes per line. */
    while (i < 512) {
        printf("%02x ", buf[i++]);
        if ((i % 16) == 0)
            printf("\n");
    }

    /* Byte 200 (0xc8) of the block is the corrupted LBN byte. */
    printf("byte to change in hex is %04x.\n", *(buf+200));
    *(buf+200) = (char)0x20;
    printf("Changed byte is %04x.\n", *(buf+200));

    /* Seek back and rewrite the whole block. */
    off = lseek(fd, offset, SEEK_SET);
    if (off == -1) {
        perror("seek problem");
        exit(EXIT_FAILURE);
    }
    ret = write(fd, buf, READ_COUNT);
    if (ret == -1) {
        perror("write trouble");
        exit(EXIT_FAILURE);
    }
    close(fd);
    return 0;
}
# cc -o putbyte putbyte.c
# ./putbyte
offset (off) is 16384, 4000
08 00 00 00 00 00 00 00 13 00 00 00 00 00 00 20
06 00 00 00 01 00 00 00 fffffffa ffffffff ffffffff ffffffff 00 00 00 00
fffffffe ffffffff ffffffff ffffffff 00 00 00 00 5c 00 02 00 03 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 50 00 01 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 10 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
01 00 02 00 00 00 00 00 00 00 00 00 01 00 00 00
ffffffff ffffffff ffffffff ffffffff 04 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 fffffff9 ffffffff ffffffff ffffffff
00 00 00 00 fffffffe ffffffff ffffffff ffffffff 00 00 00 00 5c 00 02 00
03 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 02 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 50 00 01 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 01 00 02 00 00 00 00 00 70 00 00 00
01 00 00 00 ffffffff ffffffff ffffffff ffffffff 04 00 00 00 00 00 00 00
byte to change in hex is 0000.
Changed byte is 0020.
# vfilepg /dev/rdisk/dsk1g -b 32 > /tmp/dsk1g
# vfilepg /dev/rdisk/dsk0a -b 32 > /tmp/dsk0a
# diff dsk0a dsk1g
2c2
< VOLUME "/dev/rdisk/dsk0a" (VDI 1) lbn 32
---
> VOLUME "/dev/rdisk/dsk1g" (VDI 1) lbn 32
35c35
< 0041f0 01 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
---
> 0041f0 02 00 00 00 ff ff ff ff 04 00 00 00 00 00 00 00 ................
# mount bruden_dom#bruce_fset /usr/bruce
# df
Filesystem             512-blocks     Used  Available  Capacity  Mounted on
/dev/disk/dsk1a            644808   268766     311560       47%  /
/proc                           0        0          0      100%  /proc
usr_domain#usr            1426112  1026492     200464       84%  /usr
usr_domain#var            1426112   169516     200464       46%  /var
bruden_dom#bruce_fset       50000    22788      27212       46%  /usr/bruce
# cd /usr/bruce
# ls
.tags        big5         quota.user   sm2
big4         quota.group  sm1
# ls -l
total 11402
drwx------ 2 root system 8192 Sep 28 17:04 .tags
-rwxr-xr-x 1 root system 11646960 Sep 28 17:21 big4
-rwxr-xr-x 1 obrien system 0 Sep 28 17:24 big5
-rw-r----- 1 root operator 8192 Sep 28 17:04 quota.group
-rw-r----- 1 root operator 8192 Sep 28 17:04 quota.user
-rw-r--r-- 1 root system 5 Oct 13 10:26 sm1
-rw-r--r-- 1 root system 10 Oct 13 10:27 sm2
# nvbmtpg -rR bruden_dom
==========================================================================
DOMAIN "bruden_dom" VDI 1 (/dev/rdisk/dsk0a) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "bruden_dom" VDI 2 (/dev/rdisk/dsk0b) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
==========================================================================
DOMAIN "bruden_dom" VDI 3 (/dev/rdisk/dsk2h) lbn 32 RBMT page 0
--------------------------------------------------------------------------
There is 1 page in the RBMT on this volume.
There are 19 free mcells in the RBMT on this volume.
#
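A one-byte patch of this kind can be rehearsed on an ordinary file before the raw volume is touched. The sketch below is not the method used in this case study; it is an alternative built from the standard dd, od, and printf utilities, with hypothetical function names. It patches a single byte in place and reads it back, and the demonstration runs against a scratch file, not a device.

```shell
#!/bin/sh
# patch_byte FILE OFFSET OCTAL
# Overwrites one byte at OFFSET with the value given in octal (040 = hex 20).
# conv=notrunc keeps the rest of the file intact.
patch_byte() {
    printf "\\$3" | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

# read_byte FILE OFFSET
# Prints the byte at OFFSET as two hex digits.
read_byte() {
    od -An -tx1 -j "$2" -N 1 "$1" | tr -d ' \n'
}

# Demonstration on a scratch 512-byte block of zeros.
# A real rehearsal would start from a saved copy of block 32, for example:
#   dd if=/dev/rdisk/dsk0a of=/tmp/blk32.save bs=512 skip=32 count=1
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=512 count=1 2>/dev/null
echo "before: $(read_byte "$scratch" 200)"
patch_byte "$scratch" 200 040
echo "after:  $(read_byte "$scratch" 200)"
rm -f "$scratch"
```

Saving the original block first also provides a way back if the patched byte turns out not to be the only damage.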
Case Study 2: Fragment-Free List Corruption
Overview
This case study describes a 1K fragment-free list corruption problem.
Problem Statement: Case Study 2
The problem statement received from the customer was:
"Following the creation of a new file on an existing AdvFS domain, it is noticed that many other files on the same domain now contain the same data."
Configuration: Case Study 2
The configuration on which the customer experienced the problem consisted of an AlphaServer 2100 running DIGITAL UNIX Version 3.2d.
Problem Description: Case Study 2
The customer creates a new file on one of the AdvFS domains, and many other files on the domain then show up containing the same data. The problem is reproducible on the customer’s system.
Analysis
Here is the analysis of the problem:
1. The syslog and binlog files were checked for any hardware or system problems that may have led to the corruption. None were found.
2. Perform some testing on the customer’s system and analyze results.
The specialist created different files of various sizes and checked to see whether the data for those files was equivalent to some number of other files on the domain.
For example, the file sal already existed in the domain (file sal was small and recently created). The specialist created a new file called jim in the same domain and entered the data JUNK1234 in the file. Listing the contents of the file sal showed it now contained the same data that had been entered for file jim.
eagles [207] # cat > jim
JUNK1234
eagles [208] # cat sal
JUNK1234
Entering the ls -li command for both files lists these characteristics:
• i-number
• Access rights
• Size (in bytes)
• Owner
• Group
• Time of last modification for each file
• File name
The modification of the contents of file sal to be identical to the contents of the newly created file jim was not recorded as the last file modification.
eagles [209] # ls -li jim sal
623 -rw-r--r-- 1 root dba 8 Oct 2 12:34 jim
567 -rw-r--r-- 1 sal dba 8 Sep 24 18:31 sal
The testing indicated that the problem manifested itself only for new, small files (<1K) created on the domain. In addition, there were multiple filesets in the domain, all of which exhibited the same behavior.
Since only new files were manifesting the problem, this pointed to a recent corruption. The size of the files being affected pointed to a problem with the allocation of small (1K) fragments from the fragment-free list for this domain.
3. Enter the AdvFS showfile command with the -x qualifier specifying both files.
The showfile command displays the attributes of one or more AdvFS files. The -x qualifier displays the full storage allocation map (extent map) for the specified files. See the AdvFS Commands Appendix for more information on this command.
eagles [212] # showfile -x jim sal

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
 26f.8002    2    16      0    simple    **     **  off  100%  jim

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
        extentCnt: 0

       Id  Vol  PgSz  Pages  XtntType  Segs  SegSz  Log  Perf  File
 237.8017    3    16      0    simple    **     **  off  100%  sal

    extentMap: 1
        pageOff    pageCnt    vol    volBlock    blockCnt
        extentCnt: 0
A value of 0 for the number of extents (extentCnt) for both files indicates that neither small file has any extents. AdvFS writes files to disk in sets of 8KB pages. When a file uses only part of the last page, less than 8KB, a file fragment is created.
The fragment, which is from 1KB to 7KB in size, is allocated from the fragment file. Using fragments reduces the amount of unused, wasted disk space. The fragment file is a special file not visible in the directory hierarchy.
Given the size of these files, we expect that AdvFS allocated space for them from the 1K fragment-free list.
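The page and fragment arithmetic described above can be sketched as follows. The function name is hypothetical. The 8KB page size and the 1KB to 7KB fragment range come from the text; the handling of a tail larger than 7KB (it is given a full page) is an assumption made for this sketch.

```shell
#!/bin/sh
# advfs_layout SIZE_BYTES
# Prints how many full 8KB pages and what fragment size (in KB) a file of
# SIZE_BYTES would be expected to use. frag_kb=0 means no fragment.
advfs_layout() {
    size=$1
    pages=$(( size / 8192 ))
    tail_bytes=$(( size % 8192 ))
    frag_kb=$(( (tail_bytes + 1023) / 1024 ))   # round the tail up to whole KB
    if [ "$frag_kb" -gt 7 ]; then
        # Assumption: a tail that needs more than 7KB gets a full page.
        pages=$(( pages + 1 ))
        frag_kb=0
    fi
    echo "pages=$pages frag_kb=$frag_kb"
}

advfs_layout 8
```

For the 8-byte files jim and sal this gives zero pages and a 1K fragment, which agrees with the showfile output (Pages 0, extentCnt 0) and with the expectation that their space came from the 1K fragment-free list.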
4. Enter the AdvFS shfragbf command.
The shfragbf command displays how much space is used on the fragment file. See the AdvFS Commands Appendix for more information on this command.
eagles [235] # /usr/field/shfragbf -t 1 -v /decsave/.tags/1 |more
--group pg = 0, next pg = -1, type = 1
nextFree = 43, numFree = 18
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43
frag on free list : 43, nextFree = 43
...
The output of the shfragbf command repeated the same values from this point onward. Therefore, any new fragment allocated from the 1K fragment-free list is receiving the same block (43) every time. Files that have already been allocated block 43 will all be pointing to the same fragment in the same block. Changing the data located at block 43 therefore causes an update of the file content for all the files pointing to that block.
By comparison, issuing the shfragbf command on a domain with an intact free list displays output similar to the following, in which each fragment points to a different fragment and the list ends with -1:
# /usr/field/shfragbf -t 1 -v /.tags/1
--group pg = 64, next pg = 112, type = 1
nextFree = 66, numFree = 34
frag on free list : 578, nextFree = 601
frag on free list : 601, nextFree = 602
frag on free list : 602, nextFree = 603
frag on free list : 603, nextFree = 604
[snip]
frag on free list : 624, nextFree = 623
frag on free list : 623, nextFree = -1
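The symptom described above, a fragment whose nextFree value points back to itself, can be picked out of saved shfragbf -v output mechanically. The following sketch is a hypothetical filter that assumes the line format shown in the transcripts in this section; it reads the output on standard input and reports each self-referencing fragment once.

```shell
#!/bin/sh
# find_freelist_loops
# Reads shfragbf -v output on stdin. Assumes lines of the form
#   frag on free list : N, nextFree = M
# and prints one line for each fragment N whose nextFree M equals N.
find_freelist_loops() {
    awk -F'[:,=]' '
        /frag on free list/ {
            frag = $2; next_free = $4
            gsub(/ /, "", frag)
            gsub(/ /, "", next_free)
            if (frag == next_free && !(frag in seen)) {
                seen[frag] = 1
                print "self-referencing fragment " frag
            }
        }'
}

# Sample lines taken from the corrupted domain above.
printf '%s\n' \
    'frag on free list : 43, nextFree = 43' \
    'frag on free list : 43, nextFree = 43' | find_freelist_loops
```

An intact list produces no output from this filter, because every entry names a different next fragment until the terminating -1.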
5. The msfsck and vchkdir tools were run against the domain and reported no other domain corruption.
Things Attempted: Case Study 2
1. A database search was done to see whether the problem had been seen before. A match was not found.
2. The latest patches were applied to the system. Patches do not generally clean up corruption problems, but will hopefully prevent future occurrences of known issues.
Final Solution/Summary: Case Study 2The customer had to remake the domain and restore from backups. This was a localized corruption that was not correctable with available tools.
The customer’s highest priority was to return to production operation with thedomain. The customer was less interested that the problem be fixed (for exathat a patch be developed and/or problem further analyzed by engineering) wihaving to remake the domain, than they were in receiving some assurance thproblem would not immediately reoccur due to some underlying hardware or general problem.
The problem was analyzed to the point where it was fairly certain there were hardware issues and the customer could be reassured that recreating the domrestoring the files would produce a clean file system.
The underlying cause of the problem was never determined.
Case Study 3: Corruption and System Panic
Overview
This case study describes an AdvFS file system corruption problem and resultant system panic.
Problem Statement: Case Study 3
The problem statement received from the customer was:
"AdvFS file systems become corrupted following upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a".
Configuration: Case Study 3
The configuration on which the customer experienced the problem consisted of an AlphaServer 2100 running DIGITAL UNIX Version 4.0a and the Prestoserve option. Prestoserve is a nonvolatile hardware cache used to speed up synchronous access to file systems. This option consists of a hardware card and a software driver and utilities. The system was the main NFS server for the campus.
Non-Compaq (Seagate) disk drives were used in the configuration.
Problem Description: Case Study 3
After an upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a, the system periodically corrupted files on AdvFS domains.
The customer became aware of the problem when the system reported file not found errors in response to an ls command on known, existing files.
The nature of the problem statement did not help isolate which subsystem (AdvFS, Presto, or NFS) might have been causing the corruption.
Because the customer reported CPU exceptions and the configuration did not use Compaq disk drives, hardware could also have been contributing to the problem. The disk drives were also suspected of causing problems with the device driver, since the method for detecting devices on the system had changed dramatically (for example, Dynamic Device Recognition, or DDR). The DDR interface makes assumptions about some of the parameters of the disk drives. In the case of the Seagate drives, the question was whether the assumption made about Tagged Command Queueing was incorrect, or whether the feature was not correctly implemented on the drive. The firmware provided for Compaq disk drives ensures that they always work in a specific manner. We cannot be certain of the firmware implementation in disk drives not used by Compaq.
Analysis
1. The logs were checked.
Hardware CPU exceptions were reported in the binary errlog.
2. Some testing was performed to investigate the corruption.
a. Enter the ls -l command for a known file on /usr10.
# ls -l /usr10/vogar/cs2005/poly2
/usr10/vogar/cs2005/poly2 not found
The results indicate the file is not found.
b. Enter the mount | grep command for /usr10 to find the domain name.
# mount |grep /usr10
disk7#usr10 on /usr10 type advfs (rw, quota)
The results tell us that the domain name is disk7.
c. Enter the ls -l command for the /etc/fdmns directory specifying the disk7 domain.
# ls -l /etc/fdmns/disk7
total 0
lrwxrwxrwx 1 root wheel 10 Oct 21 10:26 rz12c -> /dev/rz12c
The results provide additional information about the problem domain. At this point, we found that the device was not a Compaq device (it was a Seagate drive), so a device driver issue was possible. Bad blocks on the disk are the most common cause of corruption, so look for those on this disk.
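Steps b and c rely on reading the domain name from the mount output, where the entry has the form domain#fileset. A minimal Python sketch of that parsing follows, using the mount line shown above as its only input. The function names are hypothetical, and the sketch only builds the /etc/fdmns path; it does not inspect a live system.

```python
import re

# Matches an AdvFS entry such as "disk7#usr10 on /usr10 type advfs (rw, quota)".
MOUNT_RE = re.compile(
    r"^(?P<domain>[^#\s]+)#(?P<fileset>\S+) on (?P<mount_point>\S+) type advfs\b"
)


def parse_advfs_mount(line):
    """Return (domain, fileset, mount_point) for an AdvFS mount line, else None."""
    match = MOUNT_RE.match(line.strip())
    if match is None:
        return None
    return match.group("domain"), match.group("fileset"), match.group("mount_point")


def domain_directory(domain):
    """Directory under /etc/fdmns that holds the links to the domain's volumes."""
    return "/etc/fdmns/" + domain


line = "disk7#usr10 on /usr10 type advfs (rw, quota)"
domain, fileset, mount_point = parse_advfs_mount(line)
print(domain, fileset, mount_point, domain_directory(domain))
```

Listing the resulting directory, as in step c, shows the symbolic links to the devices that make up the domain.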
3. Possible hardware causes of the corruption were investigated.
The CPU exceptions were examined and a CPU was replaced.
The previous CPU exceptions could have caused corruption in the AdvFS structures.
4. The vdump command was used to perform a full backup. The vrestore command was used to restore the files from the savesets produced by vdump.
The full vdump and vrestore was performed to ensure data integrity after the corruption.
5. Run the verify command specifying the -d flag (to remove the corrupted files) on the domain.
The verify command checks on-disk structures such as the bitfile metadata table (BMT), the storage bitmaps, the tag directory, and the fragment file for each fileset. In this case, using the -d option also temporarily cleared the corruption.
The corrupted files that were originally present were deleted by verify. However, additional files subsequently became corrupted on several different domains within a few hours of running DIGITAL UNIX Version 4.0a with the Prestoserve option enabled.
At one point, the domains became so corrupted that attempting to mount them or repair the corruption resulted in a system panic.
6. Contact engineering to reassess the problem.
The problem was entered into the Integrated Problem Management Tool (IPMT), a Web-based tracking tool for escalating customer problems to USEG.
An AdvFS file system corruption was confirmed. Using the verify command to correct the problem was only partially successful. At this point it was still not determined which subsystem the problem was in. The panics came from AdvFS, but that was a symptom and did not point to the source of corruption. Crash dumps from the system panics were obtained and provided to engineering.
The crash dumps that occurred were all of the same type, ultimately helping to identify and reproduce the problem.
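One way to confirm that several crash dumps are of the same type is to compare a few fields from each crash-data file. The following Python sketch extracts the panic string and computes the uptime from the _crashtime and _boottime fields; the field names and sample values are taken from the representative file in this step, while the function name and the idea of summarizing files this way are illustrative assumptions.

```python
import re

PANIC_RE = re.compile(r'_panic_string:\s*0x[0-9a-f]+\s*=\s*"([^"]*)"')


def _tv_sec(text, field):
    """Return tv_sec from a '_<field>: struct { tv_sec = N ...' entry, or None."""
    pattern = "_" + field + r":\s*struct\s*\{\s*tv_sec\s*=\s*(\d+)"
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None


def summarize_crash_data(text):
    """Return the panic string and uptime in seconds found in crash-data text."""
    panic = PANIC_RE.search(text)
    crash = _tv_sec(text, "crashtime")
    boot = _tv_sec(text, "boottime")
    uptime = crash - boot if crash is not None and boot is not None else None
    return {
        "panic_string": panic.group(1) if panic else None,
        "uptime_seconds": uptime,
    }


SAMPLE = '''_crashtime: struct {
    tv_sec = 847560441
    tv_usec = 435137
}
_boottime: struct {
    tv_sec = 847559468
    tv_usec = 118564
}
_panic_string: 0xffffffff94ccb328 = "bad v1 frag free list"
'''

summary = summarize_crash_data(SAMPLE)
print(summary["panic_string"])
print(summary["uptime_seconds"], round(summary["uptime_seconds"] / 3600, 2))
```

For the sample values, the difference is 973 seconds, or about 0.27 hours, which agrees with the _uptime field reported later in the same file.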
The following is a representative crash-data file for the panics:
#
# Crash Data Collection (Version 1.4)
#
_crash_data_collection_time: Sat Nov 9 12:34:16 EST 1996
_current_directory: /
_crash_kernel: /var/adm/crash/vmunix.12
_crash_core: /var/adm/crash/vmcore.12
_crash_arch: alpha
_crash_os: Digital UNIX
_host_version: Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
_crash_version: Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
_crashtime: struct {
    tv_sec = 847560441
    tv_usec = 435137
}
_boottime: struct {
    tv_sec = 847559468
    tv_usec = 118564
}
_config: struct {
    sysname = "OSF1"
    nodename = "res.WPI.EDU"
    release = "V4.0"
    version = "464"
    machine = "alpha"
}
_cpu: 35
_system_string: 0xffffffffff8010b8 = "AlphaServer 2100 4/200"
_ncpus: 3
_avail_cpus: 3
_partial_dump: 1
_physmem(MBytes): 319
_panic_string: 0xffffffff94ccb328 = "bad v1 frag free list"
_paniccpu: 0
_panic_thread: 0xfffffc0000e0a840
_preserved_message_buffer_begin:
struct {
    msg_magic = 0x63061
    msg_bufx = 0xb84
    msg_bufr = 0xa3e
    msg_bufc = "Alpha boot: available memory from 0xc88000 to 0x13ffe000
Digital UNIX V4.0A (Rev. 464); Fri Oct 25 18:54:14 EDT 1996
physical memory = 320.00 megabytes.
available memory = 307.56 megabytes.
using 1221 buffers containing 9.53 megabytes of memory
Master cpu at slot 0.
Firmware revision: 4.6
PALcode: OSF version 1.45
ibus0 at nexus
AlphaServer 2100 4/200
cpu 0 EV-4s 1mb b-cache
cpu 1 EV-4s 1mb b-cache
cpu 2 EV-4s 1mb b-cache
gpc0 at ibus0
pci0 at ibus0 slot 0
tu0: DECchip 21040-AA: Revision: 2.2
tu0 at pci0 slot 0
tu0: DEC TULIP Ethernet Interface, hardware address: 08-00-2B-E2-65-1C
tu0: console mode: selecting 10BaseT (UTP) port: half duplex: no link
psiop0 at pci0 slot 1
Loading SIOP: script 1000000, reg 81000000, data 100df38
scsi0 at psiop0 slot 0
rz0 at scsi0 target 0 lun 0 (LID=0) (DEC RZ28 (C) DEC D41C)
rz1 at scsi0 target 1 lun 0 (LID=1) (SEAGATE ST32550N 0012)
rz2 at scsi0 target 2 lun 0 (LID=2) (SEAGATE ST15150N 0017)
rz3 at scsi0 target 3 lun 0 (LID=3) (SEAGATE ST15150N 0017)
eisa0 at pci0
ace0 at eisa0
ace1 at eisa0
lp0 at eisa0
fdi0 at eisa0
fd0 at fdi0 unit 0
fd1 at fdi0 unit 1
qvision0 at eisa0
qvision0: CMPQ Qvision 1024/E SVGA
tu1: DECchip 21140-AA: Revision: 1.2
tu1 at pci0 slot 6
tu1: DEC Fast Ethernet Interface, hardware address: 00-00-F8-02-8B-0E
tu1: console mode: selecting 100BaseTX (UTP) port: full duplex
pnvram0: Module Revision 16, Cache size: 8387584
pnvram0 at pci0 slot 7
pnvram_ssn: NO System Serial Number
presto: NVRAM tested readonly ok
presto: using 8387584 bytes of NVRAM at 0xc0000400
presto: primary battery ok
psiop1 at pci0 slot 8
Loading SIOP: script 102c000, reg 81000100, data 40608338
scsi1 at psiop1 slot 0
rz8 at scsi1 target 0 lun 0 (LID=4) (SEAGATE ST12550N 0013)
rz9 at scsi1 target 1 lun 0 (LID=5) (SEAGATE ST12550N 0013)
rz10 at scsi1 target 2 lun 0 (LID=6) (SEAGATE ST12550N 0013)
rz11 at scsi1 target 3 lun 0 (LID=7) (DEC RRD45 (C) DEC 0436)
rz12 at scsi1 target 4 lun 0 (LID=8) (SEAGATE ST15150N 0017)
tz14 at scsi1 target 6 lun 0 (LID=9) (DEC TLZ06 (C)DEC 4BQE)
lvm0: configured.
lvm1: configured.
kernel console: qvision0
dli: configured
ADVFS: using 2907 buffers containing 22.71 megabytes of memory
vm_swap_init: warning /sbin/swapdefault swap device not found
vm_swap_init: swap is set to lazy (over commitment) mode
Starting secondary cpu 1
Starting secondary cpu 2
SuperLAT. Copyright 1994 Meridian Technology Corp. All rights reserved.
tu1: transmit FIFO underflow: threshold raised to: 256 bytes
rfs_dispatch: sendreply failed
rfs_dispatch: sendreply failed
ADVFS EXCEPTION
Module = bs_bitfile_sets.c, Line = 1511
bad v1 frag free list
panic (cpu 0): bad v1 frag free list
syncing disks... 14 done
device string for dump = SCSI 0 1 0 0 0 0 0.
DUMP.prom: dev SCSI 0 1 0 0 0 0 0, block 131072
device string for dump = SCSI 0 1 0 0 0 0 0.
DUMP.prom: dev SCSI 0 1 0 0 0 0 0, block 131072
"
}
_preserved_message_buffer_end:
_kernel_process_status_begin:
  PID COMM
00000 kernel idle
00001 init
00003 kloadsrv
00039 update
00127 syslogd
00129 binlogd
00316 portmap
00324 ypbind
00333 mountd
00335 nfsd
00337 nfsiod
00340 rpc.statd
00342 rpc.lockd
00439 prestoctl_svc
00454 sendmail
00473 xntpd
00506 snmpd
00507 inetd
00509 os_mibs
00544 cron
00570 lpd
00579 lpd
00580 rlogind
00581 tcsh
00583 smbd
00587 nmbd
00622 pwd_server
00627 httpsd
00637 httpsd
00639 httpsd
00640 httpsd
00641 httpsd
00642 httpsd
00643 httpsd
00658 erpcd
00670 rarpd
00684 asplmd.exe
00705 dtlogin
00718 getty
00721 Xdec
00726 dtlogin
00780 dxconsole
00781 dtgreet
00808 rlogind
00809 tcsh
00842 httpsd
00881 tcsh
00883 lynx
00887 httpsd
00892 httpsd
00898 telnetd
00899 tcsh
00918 pine
00999 rpc.ttdbserverd
01053 httpsd
01057 httpsd
01188 httpsd
01201 httpsd
01207 httpsd
01236 httpsd
01264 httpsd
01342 httpsd
01343 httpsd
01344 httpsd
01354 vi
_kernel_process_status_end:
_current_pid: 0
_current_tid: 0xfffffc0000e0a840
_proc_thread_list_begin:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
thread 0xfffffc0000e0ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0000e0b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc000412a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc000412a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1adc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00125298c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 
0xfffffc0012528580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00125282c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00046418c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c99080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c982c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719thread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719thread 0xfffffc00121062c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecfb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf8c0 
stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0013ecf080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecedc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013eceb00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece2c0 stopped at [thread_run:2438 ,0xfffffc00002a8898] So_proc_thread_list_end: warning: Files compiled -g3: parameter values probably wrong_dump_begin: 0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../ 1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/kpcpu = (nil)i = 6121648mycpu = 0spl = 0 2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8) 3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct { hndl = 5 level = 0 dmnh = 9}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_setsfrag = 80630grpPg = 16fragPg = 4294967295fragHdrp = 0x76grpHdrp = 0xfffffc0008260000grpPgp = 0xfffffc0008260000fragPgp = 0x6setAttrp = (nil)pinPgH = struct { hndl = 396 dmnh = 0 pgHndl = 0}grpPgRef = struct { hndl = 5 dmnh = 9 pgHndl = 1}fragPgRef = struct { hndl = 5 dmnh = 0 pgHndl = 9} 4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0 5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct { hndl = 50520 level = 77 dmnh = 128}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec]ftxH = struct { hndl = 5 level = 0
dmnh = 9}prevState = ACC_VALIDdelVdp = 0xfffffc00040ed008delList = 0xfffffc000031d884delCnt = 0delMCId = struct { cell = 12 page = 12}dmnp = 0xfffffc00040ed008cp = (nil)fragFlag = 1deleteIt = 0ftxFlag = 0vp = 0xfffffc00040ed008 6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000bfap = 0xffffffff804dc528 7 bs_vfs_close(bfAccessH = 4716512) ["../../../../src/kernel/msfs/bs/bs_acces 8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44, 0xfffffc00110cf 9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230 10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000error = 0bverror = 59160000resp2 = 0xfffffc00110cf200piov = 0xfffffc000f555a48piovlen = 286061056psv = 0xffffffff80476db0minoffset = 8192maxoffset = 18446739675949101568first = 286061056imgathering = 0dupwrite = 60579840estale = 0prestohere = 0dummy = 0didmyreply = 0prevwr = 0x40000000000mywritelist = (nil)dev = 4716512pbp = 0xffffffff80476de8 11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq = 0xfpsv = 0xffffffff80476db0vp = 0xfffffc000047f7e0 12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kerndisp = 0xfffffc00005a5d20error = 0nreq = 0xfffffc000f65a180args = 0xfffffc00036a8e60 = " "res = 0xfffffc000f555a40 = ""ep = 0xfffffc00044ee680
fh = 0xfffffc00015a5700which = 7ret = 0dupstat = 2args_translated = 1psv = 0xffffffff80476db0count = 0buff = "" 13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x 14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b 15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.cpsv = 0xffffffff80476db0xprt = 0xfffffc00015a5700savecr = 0xfffffc0013ecc100save_nd = 0xfffffc0013ecb158ip = 0xfffffc00015a5700uh = 0xfffffc0013ecc100ui = 0xffffffff80476db0len = 2n = 0x28000udp_in = struct { sin_len = ’^P’ sin_family = ’^B’ sin_port = 65027 sin_addr = struct { s_addr = 102291330 } sin_zero = ""} 16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003m = 0xfffffc000254eb00thread = 0xfffffc0000e0a840psv = 0xffffffff80476db0_dump_end:_kernel_thread_list_begin:thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source nothread 0xfffffc0000e0ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc000412a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc000412a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000e0a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1bb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 
0xfffffc0012d1b340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1b080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1adc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1ab00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]
thread 0xfffffc0012d1a840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a2c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012d1a000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00125298c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012529080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00125282c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012528000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc00046418c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0004641080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c99080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c98580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0000c982c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 
0xfffffc0000c98000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107b80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012107080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106dc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106b00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719thread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719thread 0xfffffc00121062c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0012106000 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecfb80 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf8c0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf600 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf340 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecf080 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ecedc0 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013eceb00 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece840 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece580 stopped at [thread_block:2066 ,0xfffffc00002a80d0]thread 0xfffffc0013ece2c0 stopped at [thread_run:2438 ,0xfffffc00002a8898] So_kernel_thread_list_end:_savedefp: (nil)_kernel_memory_fault_data_begin:struct {
    fault_va = 0x0
    fault_pc = 0x0
    fault_ra = 0x0
    fault_sp = 0x0
    access = 0x0
    status = 0x0
    cpunum = 0x0
    count = 0x0
    pcb = (nil)
    thread = (nil)
    task = (nil)
    proc = (nil)
}
_kernel_memory_fault_data_end:
_uptime: .27 hours
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
paniccpu: 0x0
machine_slot[paniccpu]: struct {
    is_cpu = 0x1
    cpu_type = 0xf
    cpu_subtype = 0x9
    running = 0x1
    cpu_ticks = {
        [0] 0x4708
        [1] 0x0
        [2] 0x1998d
        [3] 0xd55ad
        [4] 0x273a
    }
    clock_freq = 0x400
    error_restart = 0x0
    cpu_panicstr = 0xffffffff94ccb328 = "bad v1 frag free list"
    cpu_panic_thread = 0xfffffc0000e0a840
}
tset machine_slot[paniccpu].cpu_panic_thread:
Begin Trace for machine_slot[paniccpu].cpu_panic_thread:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
> 0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../
  1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/k
  2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8)
  3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct {
    hndl = 0x5
    level = 0x0
    dmnh = 0x9
}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_sets
  4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0
  5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct {
    hndl = 0xc558
    level = 0x4d
    dmnh = 0x80
}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec]
  6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000
  7 bs_vfs_close(bfAccessH = 0x47f7e0) ["../../../../src/kernel/msfs/bs/bs_acce
  8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44,0xfffffc00110cf
9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230 10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000 11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq = 0xf 12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kern 13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x 14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b 15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.c 16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003End Trace for machine_slot[paniccpu].cpu_panic_thread:thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no"cpu_data" is not an arraythread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no_stack_trace[0]_begin:> 0 boot(0x0, 0xfffffc0000e0a840, 0x2c0000002c, 0x31, 0xfffffc0000000001) ["../ 1 panic(s = 0xffffffff94ccb328 = "bad v1 frag free list") ["../../../../src/k 2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8) 3 bs_frag_alloc(setp = 0xfffffc00041e8c08, ftxH = struct { hndl = 5 level = 0 dmnh = 9}, fragId = 0xffffffff94ccb640) ["../../../../src/kernel/msfs/bs/bs_bitfile_sets 4 fs_create_frag(0xffffffff804dc528, 0xfffffc0012c1fc00, 0x9000005,0xfffffc0 5 close_one_int(bfap = 0xffffffff804dc528, parentFtxH = struct { hndl = 50520 level = 77 dmnh = 128}) ["../../../../src/kernel/msfs/bs/bs_access.c":3422, 0xfffffc00002ed6ec] 6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000 7 bs_vfs_close(bfAccessH = 4716512) ["../../../../src/kernel/msfs/bs/bs_acces 8 msfs_inactive(0xfffffc0000357308, 0x12, 0xfffffc000042db44,0xfffffc00110cf 9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230 10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000 11 rfs3_write(args = 0xfffffc00036a8e60, resp = 0xfffffc000f555a40, nreq 
= 0xf 12 rfs_dispatch(req = (nil), xprt = 0xfffffc00015a5700) ["../../../../src/kern 13 nfs_rpc_recv(0xa305dc1600000007, 0xfffffc0000000001, 0xfffffc0013631a40,0x 14 nfs_rpc_input(0xfffffc0013631a40, 0xfffffc0000000024, 0x0,0xfffffc0013631b 15 nfs_input(m = 0xfffffc00034b5300) ["../../../../src/kernel/nfs/nfs_server.c 16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003_stack_trace[0]_end:thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no"cpu_data" is not an arraythread 0xfffffc0012106580 stopped at [stop_secondary_cpu:499,0xfffffc00004719warning: Files compiled -g3: parameter values probably wrong_stack_trace[1]_begin:> 0 stop_secondary_cpu(do_lwc = 1) ["../../../../src/kernel/arch/alpha/cpu.c":4 1 panic(s = 0xfffffc00005b1858 = "cpu_ip_intr: panic request") ["../../../../ 2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":629,0xfffffc00004
  3 _XentInt(0x0, 0xfffffc00002a9c24, 0xfffffc00005d68b0, 0x3fff, 0x1) ["../../
  4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3302,0xfffffc000
  5 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1930,0xfffffc0
_stack_trace[1]_end:
thread 0xfffffc0000e0a840 stopped at [boot:2412 ,0xfffffc000047a850] Source no
"cpu_data" is not an array
thread 0xfffffc0012106840 stopped at [stop_secondary_cpu:499,0xfffffc00004719
warning: Files compiled -g3: parameter values probably wrong
_stack_trace[2]_begin:
> 0 stop_secondary_cpu(do_lwc = 1) ["../../../../src/kernel/arch/alpha/cpu.c":4
  1 panic(s = 0xfffffc00005b1858 = "cpu_ip_intr: panic request") ["../../../../
  2 cpu_ip_intr() ["../../../../src/kernel/arch/alpha/cpu.c":629, 0xfffffc00004
  3 _XentInt(0x0, 0xfffffc00002a9be0, 0xfffffc00005d68b0, 0xfffffc0000200f00, 0
  4 idle_thread() ["../../../../src/kernel/kern/sched_prim.c":3290,0xfffffc000
  5 vm_page_tester() ["../../../../src/kernel/vm/vm_resident.c":1987,0xfffffc0
_stack_trace[2]_end:
The stack trace passed through NFS, VFS, and AdvFS code. The panic therefore did not isolate any one of these subsystems as the cause of the problem.
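To see which subsystems a trace crosses, each frame can be grouped by the kernel source directory in its path. The Python sketch below does this for a few frames copied from the panic thread's trace. The mapping of the msfs directory to AdvFS is an assumption based on the ADVFS EXCEPTION message naming a module under that directory, and the function names are hypothetical.

```python
import re

# Assumption: kernel source directories map to subsystems as follows.
SUBSYSTEM_BY_DIR = {"msfs": "AdvFS", "vfs": "VFS", "nfs": "NFS"}
PATH_RE = re.compile(r"src/kernel/([A-Za-z0-9_]+)/")


def subsystem_of(frame):
    """Return the subsystem for one frame, or 'unknown' if no path is visible."""
    match = PATH_RE.search(frame)
    if match is None:
        return "unknown"
    directory = match.group(1)
    return SUBSYSTEM_BY_DIR.get(directory, directory)


def subsystems_in_order(frames):
    """Return subsystems in frame order, skipping unknowns and adjacent repeats."""
    ordered = []
    for frame in frames:
        name = subsystem_of(frame)
        if name != "unknown" and (not ordered or ordered[-1] != name):
            ordered.append(name)
    return ordered


# Frames copied from the trace, innermost first; truncation is as in the dump.
FRAMES = [
    '2 advfs_sad(0x76, 0x9000005, 0xfffffc0008260000, 0x13af6, 0xfffffc00005839d8)',
    '6 close_int() ["../../../../src/kernel/msfs/bs/bs_access.c":3179,0xfffffc000',
    '9 vrele(vp = 0xfffffc00110cf200) ["../../../../src/kernel/vfs/vfs_subr.c":230',
    '10 rfs3_writeg() ["../../../../src/kernel/nfs/nfs3_server.c":2827,0xfffffc000',
    '16 nfs_thread() ["../../../../src/kernel/nfs/nfs_server.c":5714,0xfffffc00003',
]

print(subsystems_in_order(FRAMES))  # ['AdvFS', 'VFS', 'NFS']
```

The result only shows where the panic was detected and which layers the call passed through; as noted above, it does not identify which layer introduced the corruption.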
7. Attempt to isolate the source of the problem.
a. Since the problem was not present before any upgrade was performed, revert to a known working configuration (DIGITAL UNIX Version 3.2g). The customer was able to continue to run booted from a DIGITAL UNIX Version 3.2g system disk without the AdvFS file corruption.
Since the problem appeared after an upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a, the problem was likely related to changes in source code between the two versions. At this point it was still unknown whether it was a DIGITAL UNIX, AdvFS, Presto, or NFS bug.
b. Time was spent with engineering to determine that the corruption and panics were not hardware related nor related to changes in the CAM driver.
c. A dd copy was made of one of the corrupted domains to help in reproducing the corruption and panic.
d. The system was tested for corruption when running DIGITAL UNIX Version 4.0a with the Prestoserve option disabled to determine if the corruption was related to the Prestoserve option.
8. Elevate to engineering for the creation of a Prestoserve patch.
Once the problem was isolated to Prestoserve, we found that the latest patch kit already contained a Prestoserve patch, although it fixed a different problem.
It is always a good idea to install any related patches (in this case presto.mod), even if the patch README does not specifically mention the problem, for two reasons:
• The patch README does not always list all the things fixed by the patch.
• The patch may change the behavior of the problem and may provide additional information that can assist in troubleshooting.
So, although the patch README did not mention this problem, we had the customer install it to see if it might meet one of the two criteria above.
The engineer soon found that there was a locking problem in the Prestoserve module and produced a patch. The first version of this patch was not successful because it caused NFS to lock up. The second iteration of the patch proved to solve the problem.
Things Attempted: Case Study 3
1. Installed available AdvFS, Prestoserve option, NFS, and Tru64 UNIX kernel patches.
2. Upgraded to latest release of Tru64 UNIX.
3. Replaced the CPU due to some CPU exceptions reported in the binary errorlog file.
Final Solution/Summary: Case Study 3
• The domains had to be remade and restored from backup.
• After the problem was resolved, the customer system was running DIGITAL UNIX Version 4.0b with a new CPU and a Prestoserve module kernel patch.
• The final solution was a Prestoserve module kernel patch provided by engineering.
Turning off the Prestoserve option was actually one of the first things that the specialist recommended to the customer. The customer refused to run this test unless we could prove beforehand that Prestoserve was causing the problem. He felt that the performance gained by using this option was a necessity on this system, which acted as the main NFS server for the entire campus. The customer preferred to stay at DIGITAL UNIX Version 3.2g rather than accept the performance degradation of turning off the Prestoserve option.
One reason the customer wanted to upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a was to obtain the performance improvement from the unfunneling of AdvFS in 4.0x. After approximately two months of ineffective troubleshooting, the customer was willing to try running with the Prestoserve option disabled. The file system corruption did not occur under this test, indicating that the source of the corruption was the Prestoserve option in conjunction with AdvFS. The problem was subsequently isolated to the Prestoserve driver for DIGITAL UNIX Version 4.0x.
As you will recall from the stack trace, the panic did not include Prestoserve. Therefore, the panic was a symptom, not an indication that there was a problem in any of the subsystems that showed up in the trace.
Disabling the Prestoserve option did cause a loss of performance that the customer was displeased with. Once the Prestoserve patch was installed, they enabled the option and regained the performance.
Using the salvage Utility
What is salvage?
salvage is a new AdvFS utility available in the Tru64 UNIX Version 5.0 release. A field test version of salvage is available in DIGITAL UNIX V4.0D. (The latest versions of salvage for V4.0x and earlier releases of DIGITAL UNIX can be obtained from the UNIX Support Engineering Group (USEG) at the File systems and Cluster Support Web page on sunny.alf.dec.com.)
salvage can recover information at the block level from disks containing damaged AdvFS domains (that is, domains whose filesets cannot be mounted).
The syntax for the salvage utility is as follows:
/sbin/advfs/salvage [-x | -p] [-l] [-S] [-v number] [-d time] [-D directory] [-f archive] [-F format] [-L path] [-o option] {-V special [-V special]... | domain} [fileset [path]]
The domain parameter specifies the name of an existing AdvFS file domain from which filesets are to be recovered. Use this parameter when you want the utility to obtain volume information from the /etc/fdmns directory. The volume information used by the utility consists of the device special file names of the AdvFS volumes in the file domain. When the domain parameter is specified without optional arguments, the utility attempts to recover the files in all filesets in the domain. Do not use this parameter when you want to use the -V special flag to specify device special file names of AdvFS volumes. If you do, the utility displays an error message and exits with an exit value of 2.
The fileset [path] parameter specifies the name of a fileset to be recovered from a domain or a volume. Specify path to indicate the path of a directory or file in a fileset. When you specify a path that is a directory, the utility attempts to recover only the files in that directory tree, starting at the specified directory. When you specify a path that is a file, the utility attempts to recover only that file. Specify path relative to the mount point of the fileset.
Table 5-4: salvage Options
Option    Function
-d time    Specifies the time, as a decimal number in this format: [[CC]YY]MMDDhhmm[.SS]. When specified, salvage recovers only those files modified since this time.
-D directory    Specifies the path of the directory to which all recovered files are written. If you do not specify a directory, the utility writes recovered files to the current working directory.
-f archive    Specifies the name of the archive to write. If archive is -, salvage writes to standard output.
-F format    Specifies that salvage should recover files in an archive format. The only valid value is tar (as of V5.0).
-l    Specifies verbose mode for messages written to the log file for every file encountered during the recovery. If you do not specify this flag, the utility writes a message to the log file only for partially recovered and unrecovered files.
-L path    Specifies the path of the directory or the file name for the log file you choose to contain messages logged by this utility. If you include a log file name in the path, the utility uses that file name. If no log file name is specified, the utility places the log file in the specified directory and names it salvage.log.pid (pid is the process ID of the user process). When you do not specify this flag, the utility places the log file in the current working directory and names it salvage.log.pid.
-o option    Specifies the action the utility takes when a file being recovered already exists in the directory to which it is to be written. The values for option are:
    • yes: Overwrites the existing file without querying the user. This is the default action when option is not specified.
    • no: Does not overwrite the existing file.
    • ask: Asks the user whether to overwrite the existing file.
-S    Specifies that the utility is to run in sequential search mode, checking each page on each volume in the domain. This takes a long time on large AdvFS file domains. This flag can recover most files from a domain damaged by an incorrect execution of the mkfdmn utility. In some cases, the recovery must generate names based on the file’s tag number. These cases usually happen in the root directory, because mkfdmn usually overwrites this directory.
    When you specify this flag, there may be a security issue because the utility could recover old filesets and deleted files.
-v number    Specifies the type of messages directed to stdout. If you do not specify this flag, the default is to direct only error messages to stdout. If you specify number to be 1, both errors and the names of partially recovered files are directed to stdout. If you specify number to be 2, error messages and the status of all files as they are recovered are directed to stdout.
-V special [-V special]...    Specifies the block device special file names of volumes in the domain, for example, /dev/disk/dsk3c. The utility attempts to recover files only from the volumes you specify. If you do not specify the -V flag, you must specify the domain parameter so that the utility can obtain the special file names of the volumes in the domain from the /etc/fdmns directory. Do not use this flag with the domain parameter. If you do, an error message is displayed and the utility exits with an exit value of 2.
-x    Specifies that partially recoverable files are not to be recovered. If you do not use this flag, partial files are recovered.
    Do not use the -x flag with the -p flag. If you do, the utility displays an error message and exits with an exit value of 2.

Operation
The salvage utility helps you recover file data after an AdvFS file domain has become unmountable due to some type of data corruption. Errors that could cause data corruption of a file domain include I/O errors in file system metadata or the accidental removal of a volume.
As the utility recovers files, it saves relevant information in memory. It requires enough disk space to save the recovered files plus the log file.
Running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered. Use the -l flag to print the status of all files that are encountered. A lost+found directory contains files for which no parent directory can be found.
salvage places recovered files in directories named after the filesets. Recovered information must be moved to new filesets before you can remount the files as a fileset. You can specify the path name of the directory into which the files are recovered. If you do not specify a directory, the utility writes recovered files to the current working directory.
The results of using this utility can include some fully recovered files, partially recovered files, and unrecovered files. A partially recovered file is one for which salvage could not obtain all of the file’s data because of corrupt metadata, I/O errors reading the disk, or missing volumes. If the partially recovered file is an ASCII file, the user may be able to fill in the missing data. Compaq cannot provide any more recovery assistance for these types of files at this point.
The salvage utility opens and reads block devices directly and could present a security issue if it recovers data remaining from previous AdvFS file domains while attempting to recover data from current AdvFS file domains.
The salvage utility can be run in single user mode, without mounting other file systems. The salvage utility is available from the UNIX Shell option when you are starting from the Tru64 UNIX operating system volume 1 CDROM.
You must have root user privilege to use the salvage utility.
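The flag rules in Table 5-4 can be checked before salvage is invoked. The following is a minimal sketch of such a pre-flight check, not part of salvage itself. The function name check_salvage_args is hypothetical. The return value 2 for combining -x with -p mirrors the exit value described in the table; the return value 1 for a missing domain or -V flag is an arbitrary choice made for this sketch.

```shell
#!/bin/sh
# Hypothetical pre-flight check; it is not part of salvage itself.
check_salvage_args() {
    have_x=0; have_p=0; have_target=0; skip=0
    for arg in "$@"; do
        # Skip the value that belongs to the previous flag.
        if [ "$skip" -eq 1 ]; then skip=0; continue; fi
        case "$arg" in
            -x) have_x=1 ;;
            -p) have_p=1 ;;
            -V) have_target=1; skip=1 ;;      # -V special names a volume
            -d|-D|-f|-F|-L|-o|-v) skip=1 ;;   # flags that take a value
            -*) ;;                            # -l and -S take no value
            *)  have_target=1 ;;              # domain or fileset [path]
        esac
    done
    if [ "$have_x" -eq 1 ] && [ "$have_p" -eq 1 ]; then
        echo "check_salvage_args: -x and -p cannot be combined" >&2
        return 2    # exit value the options table gives for this conflict
    fi
    if [ "$have_target" -eq 0 ]; then
        echo "check_salvage_args: give a domain or at least one -V special" >&2
        return 1    # arbitrary value chosen for this sketch
    fi
    return 0
}

check_salvage_args -x user_domain user_fileset/data.file && echo "arguments look consistent"
```

The check cannot tell a domain name from a fileset name, so it does not attempt to detect a domain given together with -V; salvage itself reports that case.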
salvage Examples
The following example shows a salvage command that uses all the defaults to recover all files from the AdvFS file domain named user_domain. Other results include:
• A log file named salvage.log.pid is written to the fixit directory.
• The files recovered from the user_domain are also written to the fixit directory.
• Partially recoverable files are included in the recovered files. These files are written to the fixit directory.
# cd /fixit
# /sbin/advfs/salvage user_domain
This example shows a salvage command that uses the -d option to recover all files in the domain user_domain that have been changed since 00:00 on November 20, 1996.
# cd /fixit
# /sbin/advfs/salvage -d 199611200000 user_domain
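The -d value follows the [[CC]YY]MMDDhhmm[.SS] layout, so a malformed time is easy to type. The sketch below checks only the shape of the value (8, 10, or 12 digits with an optional .SS suffix), not whether the month, day, or hour is in range. The helper names salvage_time_ok and now_stamp are hypothetical.

```shell
#!/bin/sh
# salvage_time_ok VALUE
# Succeeds when VALUE has the shape [[CC]YY]MMDDhhmm[.SS].
# Only the digit count is checked, not the calendar ranges.
salvage_time_ok() {
    printf '%s\n' "$1" |
        grep -Eq '^([0-9]{8}|[0-9]{10}|[0-9]{12})(\.[0-9]{2})?$'
}

# now_stamp prints the current time as CCYYMMDDhhmm.
now_stamp() {
    date +%Y%m%d%H%M
}

if salvage_time_ok 199611200000; then
    echo "199611200000 is well formed"
fi
```

A stamp recorded with now_stamp at backup time could later be passed to salvage -d to limit recovery to files modified after that backup.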
The following example shows a salvage command that recovers the file data.file, whether or not it is only partially recoverable, from the fileset user_fileset on the volume /dev/disk/dsk3c. The data.file file is written to the recovery directory. A message is written to the log file only if the file was partially recovered.
# cd /fixit
# /sbin/advfs/salvage -V /dev/disk/dsk3c user_fileset/data.file
The following example shows a salvage command that recovers the file data.file, only if it is fully recoverable, from the fileset user_fileset on the domain user_domain. If data.file is not recovered, it is logged in the log file. Otherwise, it is written to the recovery directory.
# cd /fixit
# /sbin/advfs/salvage -x user_domain user_fileset/data.file
When to Use salvage
Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:
1. Repair domain structures using the verify utility.
2. Attempt to recover the fileset data from backup media if the verify utility does not solve the problem.
Only if both methods are unsatisfactory should you use the salvage utility.
Remember that running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered. Use the -l flag to print the status of all files that are encountered.
Because salvage may be only partially successful in recovering files, it should not be construed as a replacement for backups. Compaq recommends that regular backups be performed on any critical system data (as defined by the customer), and that any corruption issues be dealt with by restoring the corrupted files from backups.
Using salvage in Conjunction with Backup Media
In cases where the backup is not recent enough, salvage can be used in conjunction with the most recent backup to obtain current copies of files. These steps define how to perform this task:
1. Create a new file domain with the mkfdmn command.
2. Create new filesets and mount them.
3. Restore from backup to the new filesets.
4. Run the salvage utility with the -d (date) flag set to recover files that have changed since the backup.
5. Move recovered files to new filesets. salvage places recovered files in directories named after the original filesets.
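The five steps above can be sketched as a dry run that only prints the commands it would issue. Every device, domain, fileset, and mount point name below is hypothetical. The sketch also assumes that the backup was written with vdump, so that vrestore is the matching restore command, and that recovered files go to /fixit.

```shell
#!/bin/sh
# Dry run only: each step is printed, nothing is executed.
# All names below are hypothetical placeholders.
NEW_VOL=/dev/disk/dsk4c
NEW_DOMAIN=new_domain
FILESET=user_fileset
MOUNT_POINT=/mnt/user_fileset
BACKUP_DEV=/dev/ntape/tape0
OLD_DOMAIN=user_domain

plan_recovery() {
    since=$1    # time of the last backup, [[CC]YY]MMDDhhmm[.SS]
    echo "mkfdmn $NEW_VOL $NEW_DOMAIN"                           # step 1
    echo "mkfset $NEW_DOMAIN $FILESET"                           # step 2
    echo "mount -t advfs $NEW_DOMAIN#$FILESET $MOUNT_POINT"      # step 2
    echo "vrestore -x -f $BACKUP_DEV -D $MOUNT_POINT"            # step 3
    echo "/sbin/advfs/salvage -d $since -D /fixit $OLD_DOMAIN"   # step 4
    echo "# step 5: review /fixit/$FILESET, then move files to $MOUNT_POINT"
}

plan_recovery 199611200000
```

Printing the plan first gives a chance to confirm that the new volume is not one of the damaged domain's volumes, because mkfdmn destroys existing data on the volume it is given.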
Using salvage in the Absence of Backup Media
In cases where there is no backup media, salvage can be used, without the -d option, to recover all the filesets in the domain regardless of the date associated with the files.
After running salvage:
1. Mount new filesets.
2. Move recovered files to new filesets. salvage places recovered files in directories named after the filesets.
Using salvage in the Case of Very Large Domains
In the case of very large domains, you may want to recover one fileset at a time if there is not enough space to store an entire domain.
If disk space is short, you can also write the recovered files to tape in tar format by using the -F and -f flags.
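Both suggestions can be combined: recover one fileset per salvage run and write each run to tape as a tar archive. The following dry run prints one command per fileset and executes nothing. The tape device name and the fileset names are hypothetical, and a no-rewind tape device is assumed so that successive archives do not overwrite one another.

```shell
#!/bin/sh
# Dry run: prints one salvage command per fileset; nothing is executed.
# The tape device, domain, and fileset names are hypothetical.
TAPE=/dev/ntape/tape0
DOMAIN=user_domain

salvage_to_tape() {
    for fileset in "$@"; do
        # -F tar selects the archive format, -f names the archive.
        echo "/sbin/advfs/salvage -F tar -f $TAPE $DOMAIN $fileset"
    done
}

salvage_to_tape user_fileset proj_fileset
```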
Using salvage in the Case of Massive Metadata Corruption
If previous executions of salvage indicated that significant portions of metadata could not be found, or a domain has been destroyed by accidental use of the mkfdmn utility, you can use salvage with the -S flag to recover data.
Using the -S flag specifies the slowest, most complete disk search for data. The utility runs in sequential search mode, checking each page on each volume in the domain. This flag can be used to recover most files from a domain which has been damaged from an incorrect execution of the mkfdmn utility. In some cases, the recovery will generate names based on the file’s tag number. These cases usually happen in the root directory, because mkfdmn usually overwrites root directory metadata.
If a file is fully recovered but has lost its file name, the customer must try to find out the old name, use the name assigned by salvage or provide a new name based on the data in the file. This is similar to the process used to recover lost files in the UFS lost+found directory. If a file is only partially recovered, the customer must decide if there is any useful data in the file and reconstruct the file, or continue without it.
When you specify the -S flag, there may be a security issue because the utility could recover old filesets and deleted files.
Summary
Describing AdvFS Troubleshooting Practices
These troubleshooting practices were described:
• Describe the problem and any relevant surrounding circumstances in as much detail as possible.
• Check for hardware-related causes of the problem.
• Check specific locations for any error messages.
Check the advfs_err(4) reference page to find a brief description based on an error number.
• Search CANASTA if a panic is involved.
• If you think it might be a bug in the software, research the reported bugs and patches for potential similarities.
• Use system tools to check for problems.
• Use AdvFS tools and utilities to check for and to fix problems.
Troubleshooting File System Corruption
The symptoms of file system corruption that a customer might report include:
• System panic
• Domain panic
• Corrupted data
• Unexpected behavior after entering ordinary commands on files in an AdvFS file system
Resolving Known AdvFS Issues
These known issues were described in this section:
• Log half full
• BMT exhaustion
Performing Case Studies
These three case studies were described:
• A domain corruption problem
The problem statement received from the customer was:
"Following UNIX upgrade from 3.2a to 3.2c, multi-volume AdvFS domains will not mount."
The configuration that the customer experienced the problem with consisted of two AlphaServer 2100 systems running DIGITAL UNIX Version 3.2a and DECsafe ASE Version 1.2.
• A 1K fragment-free list corruption problem.
The problem statement received from the customer was:
"Following the creation of a new file on an existing AdvFS domain, it is noticed that many other files on the same domain now contain the same data."
The configuration that the customer experienced the problem with consisted of an AlphaServer 2100 running DIGITAL UNIX Version 3.2d.
• An AdvFS file system corruption problem and resultant system panic.
The problem statement received from the customer was:
"AdvFS file systems become corrupted following upgrade from DIGITAL UNIX Version 3.2g to DIGITAL UNIX Version 4.0a".
The configuration that the customer experienced the problem with consisted of an AlphaServer 2100 running DIGITAL UNIX Version 4.0a and the Prestoserve option. Prestoserve is a nonvolatile hardware cache used to speed up synchronous access to file systems. This option consists of a hardware card and a software driver and utilities. The system was the main NFS server for the campus.
Compaq (Seagate) disk drives were not used in the configuration.
Using salvage
salvage is a new AdvFS utility available in the Tru64 UNIX Version 5.0 release. (Versions of salvage for earlier versions of Tru64 UNIX can be obtained from the File systems and Clusters Support Web page at sunny.alf.dec.com.)
salvage can recover information at the block level from disks containing damaged AdvFS domains (that is, filesets cannot be mounted).
Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:
1. Repair domain structures using the verify utility.
2. If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
Only if both methods are unsatisfactory should you employ the salvage utility.
Remember that running the salvage utility does not guarantee that you will recover all your information. You may be missing files, directories, file names, or parts of files. The utility generates a log file that contains the status of files that were recovered.
Because salvage may be only partially successful in recovering files, it should not be construed as a replacement for backups. Compaq recommends that regular backups be performed on any critical system data (as defined by the customer), and that any corruption issues be dealt with by restoring the corrupted files from backups.
Exercises
Describing AdvFS Troubleshooting Practices
These questions provide a review of the material.
1. Which database should you search if there is a system panic involved with the problem?
2. Which AdvFS commands are new in Tru64 UNIX Version 5.0?
3. Why would you use the sys_check tool?
4. What are the common causes of AdvFS corruption?
5. In the case of generalized AdvFS corruption, what steps should you take to troubleshoot the problem?
6. Under what conditions should you use the salvage utility?
Solutions
Describing AdvFS Troubleshooting Practices
1. CANASTA is a Compaq internal crash dump analysis tool used worldwide inside Compaq to store and evaluate crash footprint information for OpenVMS Alpha, OpenVMS VAX, and Tru64 UNIX system crashes. CANASTA uses AI technology to provide solutions or additional troubleshooting information for system crash problems. The CANASTA tool is typically used in the CSCs, but access to the CANASTA knowledge database is also available using the CANASTA Mail Server, TIMA STARS, and COMET.
By using the AutoCLUE tool, customer crash dump information can be automatically sent to Compaq using DSNlink and analyzed using the DSNlink CLUE post-processor. Solution information, if available, can be automatically returned to the customer and/or included in the call handling system. (See http://hanhwr.hao.dec.com/CANASTA.HTML#CANASTA Overview for more information.)
2. These AdvFS commands are new in Tru64 UNIX Version 5.0:
nvbmtpg     Displays pages of an AdvFS BMT file.
nvfragpg    Displays the pages of an AdvFS fragment file.
nvlogpg     Displays the log file of an AdvFS file domain.
vsbmpg      Displays a page of the Storage BitMap (SBM) file.
nvtagpg     Displays a page formatted as a tag file page.
salvage     Recovers file data from damaged AdvFS file domains.
vdf         Displays disk information for AdvFS domains and filesets.
3. The sys_check tool is a ksh script that can be useful when debugging or diagnosing system problems. The script generates an HTML file of a Tru64 UNIX configuration. This script has been tested on DIGITAL UNIX Version 3.2* and Version 4.0 systems. (See http://www-unix.zk3.dec.com/tuning/tools/sys_check/sys_check.html.)
4. AdvFS corruption is usually caused by one of the following:
— Hardware problem
Hardware problems are the most common sources of AdvFS-related system panics. One common cause of corruption in any file system is bad blocks on the physical disk. Another common cause is outdated firmware revisions.
— Uncontrolled system shutdown
AdvFS is generally robust enough to withstand unexpected system crashes or power outages, but these events may still cause corruption in certain cases.
— Software bugs in the AdvFS software
Software bugs can often be reproduced. AdvFS software bugs are usually fixed by patches. Any available, relevant patches should be applied in the initial stages of troubleshooting a problem. Available resources should be checked for relevant patches since it is not always obvious which patches might be relevant to AdvFS.
5. Possible troubleshooting actions for generalized corruption include:
— Check the binary errorlog for bad block replacements or other hardware events. If excessive, ensure the hardware problem is resolved before any other action.
— You can try adding volumes and removing the volumes having problems. In the case of general corruption, this will probably not solve the problem. This process is time consuming with a large number of bad files.
6. Use salvage as a last resort to recover file data from a damaged file domain. Before using the salvage utility:
— Repair domain structures using the verify utility.
— If the verify utility does not solve the problem, attempt to recover the fileset data from backup media.
Only if both methods are unsatisfactory should you use the salvage utility.
A
AdvFS Commands and Utilities
About This Appendix
Introduction
This appendix describes a subset of the AdvFS command set and utilities that are particularly useful when working with AdvFS internals and troubleshooting. This appendix is intended to be used as an additional reference.
Topics:
• List of AdvFS commands
Resources
For more information on topics in this appendix as well as related topics, see the following:
• AdvFS Reference Pages
AdvFS Commands and Utilities
Overview
This section describes a subset of the AdvFS commands and utilities that are particularly useful when troubleshooting configurations that include AdvFS and when learning about AdvFS internals. The command and utility information in this appendix is based on the Tru64 UNIX Version 5.0 (STEEL) code base.
Commands in Certain Versions of Tru64 UNIX
Between DIGITAL UNIX Version 3.2x, 4.0x, and the Version 5.0 release of Tru64 UNIX, some AdvFS commands have been replaced or functionally modified. The following table lists some key commands that differ between versions.
DIGITAL UNIX Version 3.2x Command    DIGITAL UNIX Version 4.0x Command                     Tru64 UNIX Version 5.0 Command
msfsck                               verify                                                verify
vchkdir                              verify                                                verify
vods                                 vbmtpg, vbmtchain                                     nvbmtpg
no equivalent?                       vfragpg                                               nvfragpg
logread                              vlogpg, vlsnpg                                        nvlogpg
no equivalent?                       vtagpg                                                nvtagpg
no equivalent                        salvage (field test version in DIGITAL UNIX 4.0D)     salvage
no equivalent                        no equivalent                                         savemeta
no equivalent                        no equivalent                                         vdf
no equivalent                        vfile                                                 vfilepg
no equivalent                        no equivalent                                         vsbmpg

addvol
Description
The addvol command adds a volume to an existing file domain.
/usr/sbin/addvol [-F] [-x num_pages] [-p num_pages] special domain
special specifies the block special device name of the disk that you are adding to the file domain. domain specifies the name of the file domain.
Options
-F              Ignores overlapping partition or block warnings.
-x num_pages    Sets the number of pages by which the bitfile metadata table extent size grows. The default is 128 pages.
-p num_pages    Sets the number of pages to preallocate for the bitfile metadata table. The default is 0 pages.
The flags -x num_pages and -p num_pages will be retired in a future release of the operating system. Users should migrate away from using these flags. These flags were necessary in previous releases to manipulate contiguous storage for bitfile metadata table (BMT) operations. In Tru64 UNIX Version 5.0, storage for BMT operations is managed internally by the operating system.
Operation
A newly created file domain consists of one volume, which can be a disk or a logical volume. The addvol utility enables you to increase the number of volumes within an existing file domain. You can add volumes immediately after creating a file domain, or you can wait until the filesets within the domain require additional space.
For optimum performance, each volume that you add should consist of the entire disk (typically, partition c). Existing data on the volume you add is destroyed during the addvol procedure. Do not add a volume containing data that you want to keep.
The addvol command checks for potential overlapping partitions before adding the volume. If you try to add a volume that would cause partitions to overlap with any other file systems, including Logical Storage Manager (LSM), UNIX file system (UFS), and AdvFS, or that would overlap with blocks currently in use, the following message is displayed and the volume is not added:
/dev/rdisk/dsk1g or an overlapping partition is open.
Quitting ....
addvol: Can't add volume '/dev/rdisk/dsk1g' to domain 'proj_x'
If you try to add a volume that would cause partitions to overlap with other file systems, but none of the partitions are currently in use, you can choose to continue with the procedure or stop. Use the -F flag to disable testing for overlap. Disabling the overlap check can result in extensive data loss and should be used with extreme caution.
Adding volumes to a file domain does not affect the logical structure of the filesets within the file domain. You can add a volume to an active file domain while its filesets are mounted and in use. While up to 256 volumes per domain are allowed, limiting the number of volumes to three decreases the risk of disk errors that can cause the entire domain to become inaccessible.
The /etc/fdmns directory contains subdirectories named for the AdvFS domains defined on the system. Within each subdirectory there is a symbolic link (or more than one for multivolume domains) that points to the block device file that contains the data for the AdvFS volume. If the information in /etc/fdmns is destroyed, you must rebuild the /etc/fdmns directory and the symbolic links within each domain subdirectory. It is good practice to maintain a hardcopy record of each volume you have, since you must have the names of all the volumes in the domain to manually recreate the /etc/fdmns directory. You can use the advscan command to recreate the links for a file domain.
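When the volume names are known, the manual rebuild amounts to one directory per domain and one symbolic link per volume. The sketch below shows that layout in a scratch directory rather than in /etc/fdmns. The function name relink_domain is hypothetical, as are the domain and device names, and naming each link after its device file is an assumption about the usual convention.

```shell
#!/bin/sh
# relink_domain ROOT DOMAIN SPECIAL...
# Recreates ROOT/DOMAIN and one symbolic link per volume. On a real
# system ROOT would be /etc/fdmns; a scratch directory is used here.
relink_domain() {
    root=$1; domain=$2; shift 2
    mkdir -p "$root/$domain" || return 1
    for special in "$@"; do
        # Assumption: the link is named after the device file.
        ln -s "$special" "$root/$domain/$(basename "$special")" || return 1
    done
}

scratch=$(mktemp -d)
relink_domain "$scratch" user_domain /dev/disk/dsk3c /dev/disk/dsk4c
ls -l "$scratch/user_domain"
```

Every volume of a multivolume domain needs its own link, which is why the hardcopy record of volume names matters.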
You cannot exceed 256 volumes per file domain. Also, you must have root user privilege to access this utility. The AdvFS Utilities license must be present for addvol to run.
AdvFS does not support a multivolume root file system. You cannot use the addvol utility to expand the root domain.
DIGITAL UNIX V4.x Specific Information for addvol (-x and -p will be retired)
Systems with domains that contain very large numbers of files can use more metadata extents (similar to inodes in UFS) than normal. By default, AdvFS attempts to grow the bitfile metadata table (BMT) by 128 pages each time additional metadata extents are needed. Frequent requests by the system to increase the BMT cause the metadata to become very fragmented, which can result in an out of disk space error when there is actually space available.
You can reduce the amount of metadata fragmentation in one of two ways:
• Preallocating all of the space for the BMT when the volume is added
• Increasing the number of pages that the system attempts to grow the metadata table each time more space is needed.
To preallocate all the BMT space that you expect the file domain to need, use the mkfdmn command with the -p flag set to specify the number of pages to preallocate. Space that is preallocated to the BMT cannot be deallocated. Do not preallocate excessive space for the BMT. The following table provides BMT page number estimates for numbers of files.
To set the BMT to grow by more than 128 pages each time additional metadata extents are needed, use the addvol command with the -x flag set to specify a number of pages greater than 128. You can increase the number of pages to any value; the following table shows suggested guidelines.
Number of Files     Extent Size (pages)    Metadata Table Size (pages)
Less than 50,000    Default (128)          3,600
100,000             256                    7,200
200,000             512                    14,400
300,000             768                    21,600
400,000             1024                   28,800
800,000             2048                   57,600
To get the maximum benefit from increasing the number of metadata table extent pages, use the same number of pages when adding a volume with the addvol command as was assigned when the domain was created with the mkfdmn command.

advfsstat
Description
The advfsstat command displays AdvFS performance statistics.
/usr/sbin/advfsstat [options] [stats-type] domain
/usr/sbin/advfsstat [options] -f 0 | 1 | 2 domain fileset
domain specifies the name of an existing domain. fileset specifies the name of an existing fileset.
Option Function
-i sec Specifies time interval (in seconds) between displays. advfsstat collects and reports information only for the specified interval. If sec is omitted, advfsstat uses a default interval of one second.
-c count Specifies the number of reports. For example, setting the advfsstat command flags -i 1 and -c 10 would produce 10 reports at 1 second intervals. If count is omitted, advfsstat returns one report.
-s Displays raw statistics for the interval.
-R Displays the percent ratio of the returned statistics. (Use only with -b, -p, or -r flags.)
Stats-types Function
-b Displays the buffer cache statistics for the selected domain.
-f 0 Displays all fileset vnop statistics for the selected fileset.
-f 1 Displays all fileset lookup statistics for the selected fileset.
-f 2 Displays common fileset vnop statistics.
-l 0 Displays basic lock statistics.
-l 1 Displays lock statistics.
-l 2 Displays detailed lock statistics.
-n Displays namei cache statistics.
-p Displays buffer cache pin statistics.
-r Displays buffer cache ref statistics.
-S Displays smoothsync queue statistics.
-v 0 Displays volume read/write statistics.
-v 1 Displays detailed volume statistics.
-v 2 Displays volume I/O queue statistics as a snapshot of everything currently on the queue.
-v 3 Displays volume I/O queue statistics for everything put on the queue during the last interval (-i).
-B r Displays BMT record read statistics.
-B w Displays BMT record write/update statistics.
Operation
The advfsstat command displays a wide selection of AdvFS performance statistics. Statistics are reported in units of one disk block (512 bytes) per interval; the default interval is one second. Any number of options (listed in the options table) may be used. The -R option may be specified only with the -b, -p, and -r stats-types. The -i and -c options require parameters.
Only one stats-type (listed in the stats-type table) may be specified with the command. The -f, -l, -v, and -B stats-types require parameters. For the -f stats-type, the fileset parameter must also be specified.
advscan
Description
The advscan command locates AdvFS volumes (disk partitions or LSM disk groups) that are in AdvFS domains.
/sbin/advfs/advscan [-g] [-a] [-r] [-f domain_name] devices... disk_group...
devices specifies the device names of the disks to scan. disk_group specifies the Logical Storage Manager (LSM) disk groups to scan for AdvFS volumes.
Use the advscan command when you have moved disks to a new system, have moved disks around in a way that has changed device numbers, or have lost track of where the domains are. The command is also used for repair if you delete the /etc/fdmns directory, delete a file domain directory in the /etc/fdmns directory, or delete links from a file domain directory under the /etc/fdmns directory.
Options
-g Lists the partitions in the order they are found on disk.
-a Scans all disks found in any /etc/fdmns domain as well as those on the command line.
-r Recreates missing domains. The domain name is created from the device names or LSM disk group names.
-f domain_name Fixes the domain count and the links for the named domain.
Operation
The advscan command locates AdvFS volumes (disk partitions or LSM volumes) that are in AdvFS domains. Given the AdvFS volumes, you can recreate or fix the /etc/fdmns directory of a named domain or LSM disk group. For example, if you have moved disks to a new system, moved disks around in a way that has changed device numbers, or have lost track of where the AdvFS domains are, you can use this command to locate them.
Another use of the advscan command is to repair AdvFS domains when you have broken them. For example, if you mistakenly delete the /etc/fdmns directory, delete a domain directory in the /etc/fdmns directory, or delete links from a domain directory under the /etc/fdmns directory, you can use the advscan command to fix the problem.
The advscan command accepts a list of disk device names and/or LSM disk group names and searches all the disk partitions to determine which partitions are part of an AdvFS domain.
You can run the advscan command to rebuild all or part of your /etc/fdmns directory, or you can rebuild it manually by supplying all the names of the AdvFS volumes in a domain.
If the -g flag is not set, the AdvFS volumes are listed as they are grouped in domains. Set the -g flag to list the AdvFS volumes in the order they are found on each disk.
Run the advscan command with the -r flag set to recreate domains missing from the /etc/fdmns directory, missing links, or the entire /etc/fdmns directory.
Although the advscan command will rebuild the /etc/fdmns directory automatically, Compaq recommends that you always keep a hardcopy record of the current /etc/fdmns directory.
To determine if a partition is part of an AdvFS domain, the advscan command performs the following functions:
• Reads the first two pages of a partition to determine if it is an AdvFS partition and to find the file domain information.
• Reads the disk label to sort out overlapping partitions. The sizes of overlapping partitions are examined and compared to the file domain information to determine which partitions are in the file domain. These partitions are reported in the output.
• Reads the boot block to determine if the partition is AdvFS root startable.
The advscan command displays the date the domain was created, the on-disk structure version, and the last known or current state of the volume.
To mount a fileset from an AdvFS domain, the domain must be consistent. An AdvFS domain is consistent when the number of physical partitions or volumes with the correct domain ID equals both the domain volume count (a number stored in the domain) and the number of links to the partitions in the /etc/fdmns directory.
Domain inconsistencies can occur in diverse ways. Use the -f flag to correct domain inconsistencies. If you attempt to mount an inconsistent domain, a message similar to the following appears on the console:
# Volume count mismatch for domain dmnz.
dmnz expects 2 volumes, /etc/fdmns/dmnz has 1 links.
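The consistency rule can be sketched as a three-way comparison. This is only an illustration of the rule as stated, not how advscan or the mount path is implemented; the function name and its three integer parameters are hypothetical.

```python
def domain_is_consistent(partitions_found: int,
                         volume_count: int,
                         link_count: int) -> bool:
    """Return True when all three counts agree.

    partitions_found: partitions or volumes carrying the correct domain ID
    volume_count:     the volume count stored in the domain
    link_count:       links under /etc/fdmns/<domain>
    (All names here are assumptions made for this sketch.)
    """
    return partitions_found == volume_count == link_count


# Same situation as the console message: the domain expects 2 volumes,
# but its /etc/fdmns directory holds only 1 link.
print(domain_is_consistent(2, 2, 1))  # False
print(domain_is_consistent(2, 2, 2))  # True
```

In the inconsistent case, the -f flag is the documented way to bring the stored count and the links back into agreement.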
You must have root user privilege to access this command.
balance
Description
The balance utility balances the percentage of used space among volumes in a domain.
/usr/sbin/balance [-v] domain
domain specifies the name of the file domain.
Options
-v Displays information on which files are being moved to different volumes. Selecting this flag slows down the balance procedure.
Operation
The balance utility evenly distributes the percentage of used space between volumes in a multivolume domain. This improves performance and evens the distribution of future file allocations.
Use the showfdmn command to determine the percentage of used space on each volume. This information allows you to determine the need to balance the volumes.
The balance utility can be used at any time, but it is particularly useful after adding or removing a volume (addvol, rmvol) because these procedures can cause file distribution to become uneven.
When you plan to run both the defragment and balance utilities on the same domain, run the defragment utility before running the balance utility. The defragment utility often improves the balance of free space, thus enabling the balance utility to run more quickly.
Before you can balance volumes in a file domain, all filesets in the file domain must be mounted. If you try to balance volumes in an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted.
You must have root privilege to access this utility. The AdvFS utilities license must be present.
You cannot run the balance utility while the addvol, rmvol, defragment, rmfset, or balance utility is running on the same file domain. If you attempt to do this, a warning message is displayed.
The balance utility does not operate on striped files in the domain and does not include them in its calculations on used space.
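To make "percentage of used space" concrete, the sketch below computes the used percentage of each volume and the spread between the fullest and emptiest volume. The block counts and function names are invented for the example; on a real system the percentages come from showfdmn output, and this guide does not give a numeric threshold for when balancing is worthwhile.

```python
def used_percent(used_blocks: int, total_blocks: int) -> float:
    """Percentage of a volume's blocks that are in use."""
    return 100.0 * used_blocks / total_blocks


def usage_spread(volumes) -> float:
    """Difference between the fullest and emptiest volume, in percentage points.

    volumes is a list of (used_blocks, total_blocks) pairs (hypothetical input).
    """
    percents = [used_percent(used, total) for used, total in volumes]
    return max(percents) - min(percents)


# A two-volume domain just after a second, empty-ish volume was added.
volumes = [(800_000, 1_000_000), (100_000, 1_000_000)]
print(usage_spread(volumes))  # 70.0
```

A large spread after addvol or rmvol is the situation in which the text suggests running balance.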
chfile
Description
The chfile command lets you change attributes of an AdvFS file.
/usr/sbin/chfile [-l on | off] [-L on | off] filename ...
filename specifies one or more file names.
Options
-l on | off Enables or disables (on | off) forced synchronous write requests to the specified filename. By default, forced synchronous write requests to a file are off.
-L on | off Enables or disables (on | off) atomic write data logging on the specified filename. By default, atomic write data logging is turned off.
Operation
The chfile command lets you view or change attributes of an AdvFS file. The only file attribute that can be set with the chfile command is the I/O mode used when write requests are made to the file. There are three settings for this I/O mode:
• Asynchronous I/O
The default setting. Write requests are cached, the write system call returns to the calling program, and later (asynchronously), the data is written to the disk.
• Forced synchronous I/O
When this mode is enabled, write requests to a file behave as if the O_SYNC option had been set when the file was opened. The write system call returns a success value only after the data has been successfully written to disk.
• Atomic write data logging I/O
When this mode is enabled, write requests to a file are asynchronous. However, the write requests are also written to the AdvFS log file. If the system crashes during or after a write system call when this mode is enabled for the file, only complete write requests will be in the file on disk. This atomic operation guarantees that all (or none) of a write buffer will be in the file and that there will not be portions of the write request in the file. For example, suppose a write of an 8192-byte buffer is started and the system crashes during the write system call (or shortly thereafter). When the system is rebooted, either the entire 8192 bytes of data are in the file or none of them are. There is no chance that some (but not all) bytes of the write request are in the file.
The -l and -L options are mutually exclusive. You cannot simultaneously enable both forced synchronous writes and atomic write data logging on a file. However, you can override the current I/O mode for a file. For example, you can change a file’s I/O mode setting from forced synchronous writes to atomic write data logging by using the chfile -L on command.
If you do not use the options, the command displays the current state of the file’s I/O attribute.
You can use the chfile command on AdvFS files that have been remotely mounted across NFS. Run the chfile command on an NFS client to examine or change the I/O mode of AdvFS files on the NFS server.
Enabling atomic write data logging for a file slows performance because the data is written to both the user file and the AdvFS log file. Enabling forced synchronous writes to a file can also slow system performance.
To use the chfile command on AdvFS files that are mounted across NFS, the NFS property list daemon, proplistd, must be running on the NFS client and the fileset must have been mounted on the client using the proplist option.
Only writes of up to 8192 bytes are guaranteed to be atomic for files that use atomic write data logging. When writing to an AdvFS file that has been mounted across NFS, a further restriction applies: the offset into the file of the write must be on an 8K page boundary, because NFS performs I/O on 8K page boundaries.
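The two conditions in this paragraph can be expressed as a small predicate. This is a sketch that encodes only what the text states (the 8192-byte limit, plus 8K offset alignment when the file is accessed over NFS); the function and parameter names are hypothetical.

```python
PAGE_SIZE = 8192  # 8K, the limit and alignment unit named in the text


def atomicity_guaranteed(length: int, offset: int = 0,
                         over_nfs: bool = False) -> bool:
    """Return True if a single write falls within the stated guarantee."""
    if length > PAGE_SIZE:
        return False          # only writes of up to 8192 bytes are covered
    if over_nfs and offset % PAGE_SIZE != 0:
        return False          # NFS writes must start on an 8K boundary
    return True


print(atomicity_guaranteed(8192))                              # True
print(atomicity_guaranteed(8193))                              # False
print(atomicity_guaranteed(4096, offset=100, over_nfs=True))   # False
print(atomicity_guaranteed(4096, offset=8192, over_nfs=True))  # True
```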
The showfile command does not display the I/O mode for files that are mounted across NFS. To display the I/O mode of these files, use the chfile command.
Usually AdvFS, when operating on small files that do not have a size that is a multiple of 8K, puts the last part of the files (their frags) into a special metadata file called the fileset frags file as a way to reduce disk fragmentation. For example, a file that does not use atomic write data logging and has had 20K of data written to it will occupy 20K of disk space (as displayed by the du command).
Files that use atomic write data logging are exempt from this behavior. As a result, they always have a disk usage (as displayed by the du command) that is a multiple of 8K. For example, a file that has atomic write data logging enabled and has had 20K of data written to it occupies 24K of disk space.
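The 20K to 24K figure is a round-up to the next multiple of 8K. A minimal sketch of that arithmetic follows; the function name is an assumption made for the example.

```python
PAGE = 8 * 1024  # 8K page size used in the text


def logged_file_disk_usage(bytes_written: int) -> int:
    """Disk usage in bytes for a file with atomic write data logging enabled,
    rounded up to a whole number of 8K pages (no frag is used)."""
    pages = -(-bytes_written // PAGE)  # ceiling division
    return pages * PAGE


print(logged_file_disk_usage(20 * 1024) // 1024)  # 24
```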
If a file has a frag, an attempt to activate atomic write data logging on it will fail.
Files that use atomic write data logging cannot be memory-mapped through the mmap system call. The error ENOTSUP is returned if the attempt is made. If a file has been memory-mapped through the mmap system call, an attempt to activate atomic write data logging on it fails with the same error.
chfsets
Description
The chfsets command enables you to change fileset quotas (file usage limits and block usage limits).
/sbin/chfsets [-F limit] [-f limit] [-B limit] [-b limit] domain [fileset...]
domain specifies the name of the file domain. fileset specifies the name of one or more filesets.
Options
-F limit Specifies the file usage soft limit (quota) of the fileset.
-f limit Specifies the file usage hard limit (quota) of the fileset.
-B limit Specifies the block usage soft limit (quota) in 1K blocks of the fileset.
-b limit Specifies the block usage hard limit (quota) in 1K blocks of the fileset.
Operation
Filesets can have both soft and hard limits on disk storage and on the number of files. When a hard limit is reached, no further disk space allocations or file creations that would exceed the limit are allowed. The soft limit may be exceeded for a period of time (called the grace period). The grace periods for the soft limits are set with the edquota command.
The command also displays the changes made to the file and block usage limits.
The chfsets command displays the following fileset information:
• Id is a unique number (in hexadecimal format) that identifies a file domain and fileset.
• File H limit is the file usage hard limit of the specified fileset before the change followed by the new limit.
• Block H limit is the block usage hard limit of the specified fileset before the change followed by the new limit.
• File S limit is the file usage soft limit of the specified fileset before the change followed by the new limit.
• Block S limit is the block usage soft limit of the specified fileset before the change followed by the new limit.
At least one fileset within the domain must be mounted for the chfsets command to succeed. You must have root user privilege to access this command.
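The interplay of soft limit, hard limit, and grace period for fileset quotas can be sketched as follows. The hard-limit rule is taken from the Operation text. The rule that an expired grace period makes the soft limit binding is the usual quota behavior and is an assumption here, since this guide does not spell it out. All names are hypothetical, and the quantities can be read as either 1K blocks or file counts.

```python
def request_allowed(current: int, requested: int,
                    soft_limit: int, hard_limit: int,
                    grace_expired: bool) -> bool:
    """Decide whether an allocation (blocks or files) would be permitted."""
    new_total = current + requested
    if new_total > hard_limit:
        return False   # stated in the text: the hard limit is never exceeded
    if new_total > soft_limit and grace_expired:
        return False   # assumption: soft limit is enforced after the grace period
    return True


# 90 blocks in use, asking for 20 more, soft limit 100, hard limit 150.
print(request_allowed(90, 20, 100, 150, grace_expired=False))   # True
print(request_allowed(90, 20, 100, 150, grace_expired=True))    # False
print(request_allowed(140, 20, 100, 150, grace_expired=False))  # False
```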
chvol
Description
The chvol command enables you to change the attributes of a volume in an active domain.
/sbin/chvol [-r blocks] [-w blocks] [-t blocks] [-c on | off] [-A] special domain
special specifies the block special device name, such as /dev/disk/dsk2c. domain specifies the name of the file domain.
Options
-r blocks Specifies the maximum number of 512-byte blocks that the file system reads from the disk at one time.
-w blocks Specifies the maximum number of 512-byte blocks that the file system writes to the disk at one time.
-t blocks Specifies the maximum number of dirty, 512-byte blocks that the file system will cache in-memory (per volume in a domain). Dirty means that the data has been written by the application but the file system has cached it in memory so it has not yet been written to disk.
The number of blocks must be a multiple of 16; the valid range is 0-32768. The default (when a volume is added to a domain) is 768 blocks. For optimal performance, specify blocks in multiples of wblks (as specified by -w).
-c on | off Turns I/O consolidation mode on or off.
-A Activates a volume after an incomplete rmvol operation.
-l Displays the range of I/O transfer sizes.
Operation
The file system can consolidate a number of I/O transfers into a single, large I/O transfer. The larger the I/O transfer, the better the file-system performance. If you attempt to change the attributes of a volume in a domain that is not active, an error message is produced.
The initial I/O transfer parameter for both reads and writes is 128 blocks. Once you change the I/O transfer parameters with the -r flag or the -w flag, the parameters remain fixed until you change them. The values for the I/O transfer parameters are limited by the device driver. Every device has a minimum and maximum value for the size of the reads and writes it can handle. If you set a value that is outside of the range that the device driver allows, the device automatically resets the value to the largest or smallest it can handle.
By default, the I/O consolidation mode (cmode) is on. The cmode must be on for the I/O transfer parameters to take effect. You can use the -c flag to turn the cmode off, which sets the I/O transfer parameter to one page.
For file system workloads that are heavily biased toward random writes, use the -t flag to increase the file system’s dirty threshold. This may improve write performance.
Interrupting an rmvol operation can leave the volume in an inaccessible state. If a volume does not allow new allocations after an rmvol operation, use the chvol command with the -A flag to reactivate the volume.
Using the chvol command without any flags displays the current cmode and the I/O transfer parameters.
The values for the wblks and rblks attributes are limited by the device driver.
You must have root user privilege to access this command.
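The numeric rules for chvol can be checked before a value is passed to the command. The sketch below covers the -t constraints (multiple of 16, range 0-32768), the advice to use multiples of the -w value, and the way an out-of-range transfer size ends up at the nearest value the device accepts. The function names and the device limits in the example (16 and 256 blocks) are invented for illustration; real limits depend on the device driver.

```python
def valid_dirty_threshold(blocks: int) -> bool:
    """-t value: a multiple of 16 in the range 0-32768 (512-byte blocks)."""
    return 0 <= blocks <= 32768 and blocks % 16 == 0


def aligned_to_write_size(blocks: int, wblks: int) -> bool:
    """True if the -t value is a multiple of the -w transfer size."""
    return wblks > 0 and blocks % wblks == 0


def effective_transfer_size(requested: int, dev_min: int, dev_max: int) -> int:
    """Value in effect after the device resets an out-of-range request."""
    return max(dev_min, min(requested, dev_max))


print(valid_dirty_threshold(768))              # True  (the default)
print(aligned_to_write_size(768, 128))         # True  (128 is the initial -w value)
print(effective_transfer_size(4096, 16, 256))  # 256   (hypothetical device limits)
```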
defragment
Description
The defragment utility makes the files in a file domain more contiguous.
/usr/sbin/defragment [-e] [-n] [-N threads] [-t time] [-T time] [-v] [-V] domain
domain specifies the name of the file domain.
Options
-e Ignores errors and continues, if possible. Errors that are ignored are usually related to a specific file.
-n Prevents defragmentation from actually taking place. Use in conjunction with the -v flag to display statistics on the number of extents in the file domain.
-N threads Specifies the number of threads the utility runs. The default is the number of volumes in the domain. The maximum number you can specify is 20.
-t time Specifies a flexible time interval (in minutes) for the defragment utility to run. If the utility is performing an operation when the specified time has elapsed, the procedure continues until the operation is complete.
-T time Specifies an exact time interval (in minutes) for the defragment utility to run. When the specified time has elapsed, the defragmentation procedure stops, even if it is performing an operation.
-v Displays statistics on the amount of fragmentation in the file domain and information on the progress of the defragment procedure.
-V Displays the same information provided by the -v flag along with information about each operation the defragment utility performs on each file. This flag slows the defragment procedure.
Operation
When a file consists of many discontiguous file extents, the file is fragmented on the disk. File fragmentation reduces the read/write performance because more I/O operations are required to access a fragmented file.
The defragment utility attempts to reduce the number of file extents in a file domain by making files more contiguous. Defragmenting a file domain often makes the free space on a disk more contiguous, resulting in less fragmented file allocations in the future.
Before you can defragment a file domain, all filesets in the file domain must be mounted. If you try to defragment an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted.
To determine the amount of file fragmentation in a file domain before using the defragment utility, issue the defragment command with the -v and -n flags. This provides the fragmentation information without starting the defragment utility.
Before running the defragment utility, delete any files in the domain that you do not need. This gives the defragment utility more free space to use, which produces better results. Deleting files afterwards creates more free-space fragments. Additionally, run the balance utility on the domain before you run the defragment utility in order to balance domain free space before defragmenting the domain files.
To monitor the improvement made to the file domain by the defragment utility, use the verbose mode flag, -v, which displays the following information:
• The number of extents in the specified domain. (Contiguous extents in sparse files are counted as one extent after defragmentation, when in fact there are several contiguous file extents.)
• The number of files that have extents. (Note that files do not have extents if the files are so small that they are kept with the metadata.)
• The average number of extents for each file that has one or more extents.
• The efficiency of the entire file domain. An increase in value indicates improvement.
• The number of free-space fragments in the domain.
The defragment utility requires a minimum of 1 percent of the total space, or 5 megabytes per volume (whichever is less) to be free in order to run.
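The free-space requirement is the smaller of two quantities. The sketch below assumes the 1 percent figure is taken per volume, which is one reading of the sentence above; the function name is hypothetical.

```python
MB = 1024 * 1024


def required_free_bytes(volume_bytes: int) -> int:
    """Minimum free space needed on a volume: 1 percent of its size
    or 5 MB, whichever is less (per-volume reading is an assumption)."""
    return min(volume_bytes // 100, 5 * MB)


print(required_free_bytes(200 * MB) // MB)   # 2  (1 percent of 200 MB)
print(required_free_bytes(2048 * MB) // MB)  # 5  (capped at 5 MB)
```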
The defragment utility does not defragment striped files.
You cannot run the defragment utility while the addvol, balance, defragment, rmfset, or rmvol utility is running on the same file domain.
You must have root user privilege to access this utility.
logread
Description
Replaced in V5 with the nvlogpg command.
migrate
Description
The migrate utility moves a file or file pages to another volume in an AdvFS domain.
/usr/sbin/migrate [-p pageoffset] [-n pagecount] [-s volumeindex] [-d volumeindex] filename
filename specifies the name of the file or file pages to be migrated from the volume. The file can be simple or striped.
Options
-p pageoffset Specifies the page offset of the first page to migrate. The first page of the file is page 0. The default page offset is 0. If you do not specify the -p flag, the migrate command migrates pages starting at page 0 of the file.
-n pagecount Specifies the number of pages to migrate, starting at the pageoffset value. The default pagecount is to EOF. If you do not specify the -n flag, the migrate command migrates pages from the pageoffset value to the end of the file.
-s volumeindex Specifies the volume index number of the volume from which the pages are to be migrated. Use the showfile -x command to determine the volume index number of the volumes in the AdvFS file domain.
If you specify the -s flag and the specified volume does not contain any data extents of that file, the utility returns success without taking any action.
You must use the -s flag when you are migrating striped files. You can move pages of a striped file, or a striped file segment (the entire portion of a striped file that resides on the specified volume), to another volume.
-d volumeindex Specifies the volume index number of the volume to which the pages are to be migrated. You can determine the volume index number of the volumes in an AdvFS file domain by using the showfile -x command.
If you do not specify the -d flag, the file or file pages are moved to any volume or volumes with available space.
Operation
The migrate utility moves the specified simple file to another volume in the same file domain. The utility also moves pages of a simple file or pages of a striped file segment to another volume (or volumes, if necessary) within the file domain.
Because there are no read/write restrictions when using this command, you can migrate a file while users are reading it, writing to it, or both, without disrupting file I/O. File migration is transparent to users.
When you run the migrate utility with only the -p and -n flags, the utility attempts to allocate the destination pages contiguously on one destination volume in the file domain. If there are not enough free, contiguous blocks to accomplish the move, the utility then attempts to allocate the pages to the next available blocks on the same volume. If there are not enough free blocks on the same volume, the utility then attempts to move the file to the next available volume or volumes. The utility returns an error diagnostic if it cannot accomplish the move.
You must use the -s, -n, and -p flags in order to move pages of a striped file from one volume to another. Only those pages assigned to the source volume are moved to the destination volume; not all pages in the file are moved.
You can use the migrate utility to move heavily accessed files or pages of files to a different volume in the file domain. Use the -d flag to indicate a specific volume. Also, you can use the utility to defragment a specific file, because the migrate utility defragments a file whenever possible.
You can perform only one migrate operation on a given file at a time. When you migrate a striped file, you can migrate from only one source volume at a time.
You must have root user privilege to access this command.
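The defaults for the migrate -p and -n flags determine which pages are selected. A minimal sketch of that selection follows; the function name is hypothetical, and clamping a too-large pagecount at end of file is an assumption rather than documented behavior.

```python
from typing import Optional


def pages_to_migrate(file_pages: int, pageoffset: int = 0,
                     pagecount: Optional[int] = None) -> range:
    """Page numbers selected by -p pageoffset and -n pagecount.

    pageoffset defaults to 0 (first page of the file);
    pagecount defaults to "through end of file".
    """
    if not 0 <= pageoffset <= file_pages:
        raise ValueError("pageoffset is outside the file")
    if pagecount is None:
        end = file_pages
    else:
        end = min(file_pages, pageoffset + pagecount)  # assumed clamp at EOF
    return range(pageoffset, end)


print(list(pages_to_migrate(10, pageoffset=4)))               # [4, 5, 6, 7, 8, 9]
print(list(pages_to_migrate(10, pageoffset=2, pagecount=3)))  # [2, 3, 4]
```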
mkfdmn
Description
The mkfdmn command creates a new AdvFS file domain.
/sbin/mkfdmn [-F] [-l num_pages] [-o] [-p num_pages] [-r] [-x num_pages] special domain
special specifies the block special device name, such as /dev/disk/dsk1c, of the initial volume that you use to create the file domain. domain specifies the name of the file domain.
Options
-F Ignores overlapping partition or block warnings.
-l num_pages Sets the number of pages in the log file. AdvFS rounds this number up to a multiple of four.
-o Overwrites an existing file domain, allowing you to recreate the domain structure.
-p num_pages Sets the number of pages to preallocate for the bitfile metadata table (BMT). The default is 0 (zero) pages.
-r Specifies the file domain as the root domain. This prevents multiple volumes in the root domain. AdvFS supports only one volume in the root domain.
-x num_pages Sets the number of pages by which the bitfile metadata table (BMT) extent size grows. The default is 128 pages.
-V3 | -V4 Specifies the version of the on-disk data structures. The default is V4. The AdvFS domain version number for Tru64 UNIX V5 is 4.
The -x num_pages and -p num_pages flags will be retired in a future release of the operating system. Users should plan to migrate away from use of these flags. The use of these flags was necessary in previous releases to manipulate contiguous storage for bitfile metadata table (BMT) operations. In Tru64 UNIX Version 5.0, storage for BMT operations is managed internally by the operating system using the RBMT.
Operation
The mkfdmn command creates a file domain, which is a logical construct containing both physical volumes (disks or disk partitions) and filesets. When you create a file domain, you must specify one volume. If the new file domain will overlap mounted file systems, swap areas, or reserved partitions, you are given the choice of continuing or aborting the command.
Existing data on the volume you assign to a new file domain is destroyed when the file domain is created.
If you try to add a volume that would cause partitions to overlap with any other file system, including LSM, UFS, and AdvFS, or that would overlap with blocks that are in use, the system displays a message asking if you wish to continue. Using the -F flag disables testing for overlap.
The mkfdmn command does not create a file system that you can mount. In order to mount a file system, the file domain must contain one or more filesets. After you run the mkfdmn command, you must run the mkfset command to create at least one fileset within the new file domain. You can access the file domain as soon as you mount one or more filesets. For more information about creating filesets, see mkfset(8).
To remove a file domain, dismount all filesets in the domain you want to remove. Then use the rmfdmn command to remove the file domain. You can also remove the definition of the domain by removing the defining directory and all links under it in the /etc/fdmns directory. To accomplish this, execute the following command line:
# rm -rf /etc/fdmns/domain_name
Although you can use the advscan command to recreate the file domain links, it is good practice to maintain a current hardcopy record of each volume you have. You must have the names of all the volumes in the domain to recreate the /etc/fdmns directory by hand.
You must have root user privilege to use the mkfdmn command.
You cannot have more than 100 active file domains at one time. A file domain is active when at least one fileset is mounted.
Each file domain must have a unique name of up to 31 characters. All whitespace characters (tab, line feed, space, and so on) and the / # : * ? characters are invalid for file domain names.
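The naming rule can be checked with a few lines of code. The sketch below encodes the stated constraints (at most 31 characters, no whitespace, none of / # : * ?); rejecting an empty name is an added assumption, and the function name is hypothetical.

```python
INVALID_CHARS = set("/#:*?")
MAX_NAME_LEN = 31


def valid_domain_name(name: str) -> bool:
    """True if name satisfies the file domain naming rule described above."""
    if not name or len(name) > MAX_NAME_LEN:  # empty-name check is an assumption
        return False
    return not any(ch.isspace() or ch in INVALID_CHARS for ch in name)


print(valid_domain_name("usr_domain"))  # True
print(valid_domain_name("bad name"))    # False (contains a space)
print(valid_domain_name("a#b"))         # False (contains #)
```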
DIGITAL UNIX V4.x Specific mkfdmn Information
Systems with file domains that contain very large numbers of files can use more BMT extents (similar to inodes in UFS) than normal. By default, AdvFS attempts to grow the BMT by 128 pages each time additional BMT extents are needed. Frequent requests by the system to increase the BMT cause the metadata to become very fragmented, which can result in an out of disk space error.
You can reduce the amount of metadata fragmentation in one of two ways: by increasing the number of pages by which the system attempts to grow the BMT each time more space is needed, or by preallocating all of the space for the BMT when the file domain is created.
To preallocate all of the BMT space you expect the file domain to need, use the mkfdmn command with the -p flag set to specify the number of pages to preallocate. Space that is preallocated for the BMT cannot be deallocated, so do not preallocate more space than you need for it. The following table provides BMT page number estimates for numbers of files.
To set the BMT to grow by more than 128 pages each time additional metadata extents are needed, use the mkfdmn command with the -x flag set to specify a number of pages greater than 128. You can increase the number of pages to any value; the following table shows suggested guidelines.
If you make a file domain using the -p or -x flags to increase the BMT extent allocations, you must use the same flag with the same number of pages when you add a volume to the file domain with the addvol command. See addvol(8) for information about adding a volume to a file domain.
Use a value in the -x num_pages argument that maintains the following ratio between the BMT extent size (the number of pages for the -x parameter) and the log file size (the number of pages for the -l parameter):
BMT extent size <= (log file size * 8184) / 4
It takes about one minute to process 5000 BMT extent size pages with the -x flag. A process that initiates a BMT extent size operation must take into account that very large values for -x will take a long time to complete.
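The ratio and the timing estimate can be worked through numerically. In the sketch below, the 512-page log size is only an example input, the function names are hypothetical, and the time estimate simply scales the "one minute per 5000 pages" figure linearly, which is an assumption.

```python
def max_bmt_extent_pages(log_pages: int) -> int:
    """Largest -x value allowed by: extent size <= (log file size * 8184) / 4."""
    return (log_pages * 8184) // 4


def extent_size_ok(extent_pages: int, log_pages: int) -> bool:
    """True if the -x value respects the ratio for the given -l value."""
    return extent_pages <= (log_pages * 8184) / 4


def estimated_minutes(extent_pages: int) -> float:
    """Rough processing time, assuming about one minute per 5000 pages."""
    return extent_pages / 5000


log_pages = 512  # example -l value, not taken from the guide
print(max_bmt_extent_pages(log_pages))    # 1047552
print(extent_size_ok(2048, log_pages))    # True
print(estimated_minutes(5000))            # 1.0
```

With a log of this size the ratio is far from binding for the extent sizes in the guideline table; the processing time is the more practical limit on very large -x values.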
Number of Files      Suggested BMT Extent (pages)   BMT Size (pages)
Less than 50,000     Default (128)                  3,600
100,000              256                            7,200
200,000              512                            14,400
300,000              768                            21,600
400,000              1024                           28,800
800,000              2048                           57,600
mkfset
Description
The mkfset command creates an AdvFS fileset within an existing domain.
/sbin/mkfset domain fileset
domain specifies the name of an existing AdvFS file domain. fileset specifies the name of the fileset to be created in the specified file domain.
Operation
You must create at least one fileset per file domain; however, you can create multiple filesets within a file domain. You can mount and unmount each fileset independently of the other filesets in the file domain. You can assign fileset quotas (block and file usage limits) to filesets. You must have root user privilege to use this utility.
Each fileset within a domain must have a unique name of up to 31 characters. All whitespace characters (tab, new line, space and so on) and the / # : * ? characters are invalid for fileset names.
Tru64 UNIX supports an unlimited number of filesets per system; only 512 filesets can be mounted at one time.
mountlist
Description
The mountlist command checks for mounted AdvFS filesets.
/sbin/advfs/mountlist [-v]
Options
-v Prints a list of the mounted filesets.
Operation
The mountlist command is used by the setld -d function. The /usr/.smdb./OSFADVFS***.scp routine calls this command to check for mounted filesets before proceeding with the installation.
The exit status from mountlist is 0 if no mounted AdvFS filesets are found. An exit status of 1 indicates that either an error occurred or mounted AdvFS filesets were found. You must have root user privilege to use this utility.
ncheck
Description
The ncheck command lists the i-number or tag and the path name for all files in a file system.
/usr/sbin/ncheck [-i numbers] [-asm] filesystem
filesystem specifies one or more file systems. Specify any file system by entering its full path name. The full path name is the file system’s mount point in the /etc/fstab file. You can also specify a UFS file system by entering the name of its device special file. You can specify an AdvFS fileset by entering the name of the file domain, a pound sign (#) character, and the name of the fileset. For example: root_domain#root.
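The domain#fileset form of the filesystem argument can be split mechanically. The sketch below shows one way a wrapper script might do so; the names FilesetSpec and parse_fileset_spec are hypothetical and are not part of ncheck.

```python
from typing import NamedTuple


class FilesetSpec(NamedTuple):
    domain: str
    fileset: str


def parse_fileset_spec(spec: str) -> FilesetSpec:
    """Split an AdvFS 'domain#fileset' argument into its two parts."""
    domain, sep, fileset = spec.partition("#")
    # '#' is not valid inside a fileset name, so exactly one is expected.
    if not sep or not domain or not fileset or "#" in fileset:
        raise ValueError(f"not a domain#fileset specification: {spec!r}")
    return FilesetSpec(domain, fileset)


if __name__ == "__main__":
    print(parse_fileset_spec("root_domain#root"))
```

A mount point such as /usr or a UFS device special file contains no pound sign, so it raises ValueError and can be passed through unchanged by the caller.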
Options
-a Includes in the list the path names . (dot) and .. (dot dot), which are ordinarily suppressed.
-i numbers Lists only those files with the specified i-numbers (UFS) or tags (AdvFS).
-m Includes in the list the mode, UID, and GID of the files. To use this flag you must also specify the -i or the -s flag on the command line.
-s Lists only the special files and files with set-user-ID mode.
Operation
The ncheck command with no flags generates a list of all files on every specified file system. The list includes the path name and the corresponding i-number or tag of each file. Each directory file name in the list is followed by a /. (slash dot). Use the available flags to customize the list to include or exclude specific types of files.
The files are listed in order by i-number or tag. To sort the list in a more useful format, pipe the output to the sort command. You must have root user privilege to access this command.
The ncheck command checks the /etc/fstab file for the specified domain and file system entry. If there is no entry in /etc/fstab for the specified file system, an error message is displayed to indicate that the file does not exist.
nvbmtpg
Description
The nvbmtpg command displays pages of an AdvFS BMT file. This command is new in Tru64 UNIX V5.0 and should be used in place of the vbmtpg and vbmtchain commands provided in earlier releases.
/sbin/advfs/nvbmtpg [-R] [-v] { domain_id | bmt_id } [-f]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id page [-f]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id page mcell [-c]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id [-a]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id fileset_id [ file_id ] [-c]
/sbin/advfs/nvbmtpg [-R] [-v] bmt_id -s b block
/sbin/advfs/nvbmtpg [-R] [-v] domain_id fileset_id -s f frag
/sbin/advfs/nvbmtpg [-R] [-v] volume_id -b block [ mcell]
/sbin/advfs/nvbmtpg [-R] volume_id -d dump_file
bmt_id specifies the BMT file on an AdvFS volume or a BMT file that has been saved as a dump_file. Use the following format if you want to specify a dump file:
volume_id | [-F] dump_file
Specify the -F flag to force the command to interpret the name you supply as a file name.
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to indicate the domain argument is to be used as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
volume specifies the name of an AdvFS volume in an AdvFS file domain. volume_index specifies the index number of a volume in an AdvFS file domain. Specify the -V flag to indicate the volume argument is to be used as the volume name. The volume argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
fileset_id specifies an AdvFS fileset using the following format:
[-S] fileset | -T fileset_tag
Specify the -S flag to indicate the fileset argument is to be used as the fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset’s tag number, -T fileset_tag.
file_id specifies a file name in the following format:
file | [-t] file_tag
Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.
dump_file specifies the name of a file that contains the output from this utility.
mcell specifies the number of a metadata cell (mcell) from a file.
page specifies the file page number of a file.
Options
-a Specifies that all the pages in the BMT be displayed.
-b block Specifies the logical block number of a disk block on an AdvFS volume.
-c Displays the entire chain of mcells.
-d dump_file Specifies the name of a file that will hold the contents of the specified BMT file.
-F Forces the command to interpret the name you supply as a file name.
-f Displays the number of free mcells.
-l Displays the deferred delete list of mcells.
-R Specifies that information about the RBMT is to be displayed.
-s b block Specifies the logical block number of a disk block on an AdvFS volume. When you use this flag, the utility searches the specified BMT file for an mcell that has an extent record for a file that contains the specified block.
-s f frag Specifies the number of a file fragment in the frag file for a fileset. When you use this flag, the utility searches all BMT files (there is one on each AdvFS volume) for an mcell that:
• Belongs to a file in the specified fileset
• Has an attribute record that indicates the file is using the specified frag ID.
-s t tag Specifies the file tag number. The utility searches one or all of the BMT files for an mcell with this tag.
-T fileset_tag Specifies the tag number for a fileset.
-t file_tag Specifies the tag number for a file.
-v Displays all the data in a specified mcell.
volume Specifies the name of an AdvFS volume in an AdvFS file domain.
volume_index Specifies the index number of a volume in an AdvFS file domain.
Operation
The nvbmtpg utility formats, dumps, and displays pages of the bitfile metadata table (BMT) files. BMTs are composed of mcells. Each file in an AdvFS domain is described by a collection of mcells. The mcells for each file are chained together. The first mcell in a chain is called the primary mcell.
AdvFS creates one BMT file for each AdvFS volume in an AdvFS file domain. AdvFS first creates a BMT file on the volume you specify when you run the mkfdmn utility to create an AdvFS file domain. As you add volumes to the domain with the addvol utility, a BMT file is created on each added volume.
The BMT file for a volume never migrates from the volume. When you remove a volume from a domain, the BMT file on the removed volume is, like the volume itself, no longer accessible.
A BMT file is an array of 8 Kbyte file pages, each page containing a header and an array of metadata cells (mcells). The purpose of a BMT file is to contain all the metadata for all files that are stored on an AdvFS volume.
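Because a BMT file is an array of 8 Kbyte pages, page N begins at byte offset N * 8192. The sketch below reads one page from a saved copy of a BMT file. It assumes, without confirmation from this guide, that a file saved with -d is a byte-for-byte copy of the BMT pages; the function names are hypothetical, and the page header and mcell layout are not decoded.

```python
PAGE_SIZE = 8 * 1024  # 8 Kbyte file pages, as stated in the text


def page_offset(page: int) -> int:
    """Byte offset of a given page number within the file."""
    if page < 0:
        raise ValueError("page number must be non-negative")
    return page * PAGE_SIZE


def read_page(path: str, page: int) -> bytes:
    """Return the raw bytes of one page from a saved BMT copy."""
    with open(path, "rb") as f:
        f.seek(page_offset(page))
        data = f.read(PAGE_SIZE)
    if len(data) != PAGE_SIZE:
        raise ValueError(f"page {page} is missing or truncated")
    return data
```

Interpreting the bytes is the job of nvbmtpg itself; the sketch only shows how page numbers map to file offsets.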
You can use this command to:
• Display a summary of the BMT on one AdvFS volume or a summary of all the BMT files (there is one per volume) in a domain.
• Display a page of mcells or one mcell or a chain of mcells. The page can be specified by BMT page number or volume block number. An mcell can be specified by a number or by specifying the primary mcell of a file.
• Search for a mcell based on an extent that maps a volume block or a file that uses a given frag ID.
See nvbmtpg(8) for more information.
It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvbmtpg utility fails in this situation, run it again.
nvfragpg
Description
The nvfragpg command displays the pages of an AdvFS frag file. This command is new in Tru64 UNIX V5.0. This command should be used in place of the vfragpg command provided in earlier releases.
/sbin/advfs/nvfragpg [-v] [-f] frag_id
/sbin/advfs/nvfragpg [-v] [-f] frag_id page
/sbin/advfs/nvfragpg volume_id -b block
/sbin/advfs/nvfragpg [-v] [-f] domain_id fileset_id -d dump_file
frag_id specifies a frag file using the following format:
domain_id fileset_id | [-F] dump_file
The dump_file is a previously-saved copy of a frag file. Use the -F flag to force the utility to interpret the dump_file as a file name when it has the same name as a domain name.
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path name, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
fileset_id specifies an AdvFS fileset using the following format:
[-S] fileset | -T fileset_tag
Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, or the fileset’s tag number, -T fileset_tag.
file_id specifies a file name in the following format:
file | [-t] file_tag
page specifies the file page number of a file.
Options
-b block Specifies the logical block number of a disk block on an AdvFS volume.
-d dump_file Specifies the name of a file that contains the output of this utility.
-f Displays the frag file free list.
-v Displays all the data in a frag file.
Operation
Use the nvfragpg utility to display information about frag file metadata.
Each fileset in an AdvFS domain has one frag file. Frag files are collections of file fragments. The collections of file fragments in a frag file are called groups, because the file fragments are grouped by file fragment size: file fragments of 1 Kbyte or less are collected in one group; file fragments more than 1 Kbyte up to 2 Kbytes are collected in another group; and so on, up to a group that contains file fragments that are more than 7 Kbytes up to 8 Kbytes.
The first 1024 bytes of each group in a frag file contains the metadata for the file fragments in the group. A group is never larger than 128 Kbytes, so a group that collects 1 Kbyte fragments can hold at most 127 fragments, a group that collects 2 Kbyte fragments can hold at most 63 fragments, and so on. A group that collects 8 Kbyte fragments can hold at most 15 fragments.
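The group capacities quoted above follow from simple arithmetic: a 128 Kbyte group loses its first 1 Kbyte to metadata, and the remaining 127 Kbytes are divided by the fragment size. The following sketch reproduces that calculation; the function names are illustrative only.

```python
KB = 1024
GROUP_SIZE = 128 * KB    # a group is never larger than 128 Kbytes
GROUP_METADATA = 1 * KB  # first 1024 bytes hold the group's metadata
MAX_FRAG = 8 * KB


def frag_group_kb(frag_bytes: int) -> int:
    """Fragment size class (in Kbytes) that holds a fragment of this length."""
    if not 0 < frag_bytes <= MAX_FRAG:
        raise ValueError("fragment must be 1 byte to 8 Kbytes")
    return (frag_bytes + KB - 1) // KB  # round up to a whole Kbyte


def frags_per_group(frag_kb: int) -> int:
    """Maximum number of fragments of the given size class in one group."""
    if not 1 <= frag_kb <= 8:
        raise ValueError("size class must be 1 to 8 Kbytes")
    return (GROUP_SIZE - GROUP_METADATA) // (frag_kb * KB)


if __name__ == "__main__":
    for size in range(1, 9):
        print(size, frags_per_group(size))
```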
You can use the nvfragpg command to:
• Display a summary
• Display a single frag file page
• Display corrupted volumes
• Save a frag file
For more information, see nvfragpg(8).
It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvfragpg utility fails in this situation, run it again.
nvlogpg
Description
The nvlogpg command displays the log file of an AdvFS file domain. This command is new in Tru64 UNIX V5.0 and should be used in place of the vlogpg and vlsnpg commands provided in DIGITAL UNIX Version 4.0x releases and in place of the logread command in DIGITAL UNIX Version 3.2x releases.
/sbin/advfs/nvlogpg log_id
/sbin/advfs/nvlogpg [-v | -B] log_id page [record_offset [-f]]
/sbin/advfs/nvlogpg [-v | -B] log_id [-R | -a ]
/sbin/advfs/nvlogpg [-v | -B] log_id [-R | -a] page_offset
/sbin/advfs/nvlogpg domain_id | volume_id -d dump_file
/sbin/advfs/nvlogpg [-v | -B] volume_id -b block
log_id specifies a log file in an AdvFS domain or a log file that has been saved by the utility as a dump_file. Use the following format:
domain_id | volume_id [-F] dump_file
Specify the -F flag to force the utility to interpret the name you supply as a file name.
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
dump_file specifies the name of a file that contains the output from this utility.
page specifies the file page number of a file.
page_offset specifies the offset in the log file.
record_offset specifies a byte offset in a page of the log file.
Options
-a Specifies that all the pages in the log file be displayed.
-B Specifies that only the transaction id for each log file entry be displayed.
-b block Specifies the logical block number of a disk block on an AdvFS volume.
-d dump_file Specifies the name of a file that will hold the contents of the specified log file.
-e Specifies that the last active record in the log file is to be displayed.
-f Specifies that all subtransactions of the parent transaction are to be followed.
-s Specifies that the first active record in the log file (the start of the log file) is to be displayed.
-v Displays all the data in a specified log.
Operation
The nvlogpg command locates the log file of an AdvFS file domain and displays records from it in various ways.
The log file for a domain is a bitfile, organized as an array of 8KB disk pages. Each page consists of a fixed-size header record, a number of variable-sized data records, and a variable-sized trailer record. Each data record consists of a fixed-size header and a variable amount of data.
The log file for a domain contains the metadata, the log, of each transaction. Before a transaction is written to disk, its logged metadata is written to disk. Because the log of a transaction contains the information necessary to redo the transaction, the file system can maintain consistency on disk and recover from transaction failures when they occur. When the domain is next activated, the logged records are used to replay any transactions that did not complete, for example because the system crashed.
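The log-then-write ordering described above can be illustrated with a toy model. This sketch is not the AdvFS implementation: the class and method names are invented for illustration, the log is an in-memory list, and real log record formats, page headers, and log trimming are not modeled.

```python
class ToyDomain:
    """Toy model of redo logging: log first, write later, replay on activation."""

    def __init__(self):
        self.log = []   # stands in for the on-disk log file
        self.disk = {}  # stands in for on-disk metadata

    def log_transaction(self, txn_id, updates):
        """Record the changes in the log before any of them reach 'disk'."""
        self.log.append((txn_id, dict(updates)))

    def write_transaction(self, txn_id):
        """Write the logged changes of one transaction to 'disk'."""
        for logged_id, updates in self.log:
            if logged_id == txn_id:
                self.disk.update(updates)

    def activate(self):
        """Redo every logged transaction in order; repeating a redo is harmless."""
        for _logged_id, updates in self.log:
            self.disk.update(updates)


if __name__ == "__main__":
    domain = ToyDomain()
    domain.log_transaction(1, {"file_a.size": 100})
    domain.write_transaction(1)
    domain.log_transaction(2, {"file_b.size": 200})
    # Simulated crash here: transaction 2 was logged but never written.
    domain.activate()
    print(domain.disk)
```

The point of the model is only the ordering: because transaction 2 was logged before the simulated crash, replaying the log at activation brings the metadata to a consistent state.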
Using this command you can:
• Display a summary
• Display log file pages and records
• Save and examine the log file
It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, the AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvlogpg utility fails in this situation, run it again.
nvtagpg
Description
The nvtagpg command displays a page formatted as a tag file page. This command is new in Tru64 UNIX V5.0. This command should be used in place of the vtagpg command provided in earlier releases.
/sbin/advfs/nvtagpg [-v] tag_id
/sbin/advfs/nvtagpg [-v] tag_id | {page | -a}
/sbin/advfs/nvtagpg [-v] fileset_id file_id
/sbin/advfs/nvtagpg domain_id fileset_id -d dump_file
/sbin/advfs/nvtagpg domain_id -d dump_file
/sbin/advfs/nvtagpg volume_id -b block
tag_id specifies a tag file using the following format:
roottag_id | fileset_id
The roottag_id parameter specifies the root tag file using the following format:
domain_id | [-F] dump_file
The dump_file parameter is a previously-saved copy of the fileset’s tag file. You can use the -F flag to force the utility to interpret the dump_file parameter as a file name if it has the same name as a domain name.
filesettag_id specifies a fileset tag file using the following format:
domain_id fileset_id | [-F] dump_file
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
fileset_id specifies an AdvFS fileset using the following format:
[-S] fileset | -T fileset_tag
Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset’s tag number, -T fileset_tag.
file_id specifies a file name in the following format:
[-F] file | [-t] file_tag
Specify the -F flag to force the command to interpret the name you supply as a file name. Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.
page specifies the file page number of a file.
Options
-a Specifies that all the pages in the file be displayed.
-b block Specifies the logical block number of a disk block on an AdvFS volume.
-d dump_file Specifies the name of a file that will hold the contents of the specified tag file.
-v Displays all the data in a specified tag file.
Operation
The nvtagpg utility displays formatted pages of a root tag file or a fileset tag file. The utility can also save a copy of a tag file.
Each AdvFS domain has a root tag file that lists all the filesets in the domain. Each fileset has a tag file that lists all the files in the fileset.
Use the nvtagpg command to:
• Display a root tag file
• Display a fileset tag
• Save the tag file
• Display corrupted AdvFS volumes
It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the nvtagpg utility fails in this situation, run it again.
rmfdmn
Description
The rmfdmn command removes a file domain.
/sbin/rmfdmn [-f] domain
domain specifies the name of an existing file domain.
Options
-f Turns off the message prompt.
Operation
The rmfdmn utility enables you to remove an unused file domain. Before you can remove a file domain, unmount all filesets and clone filesets from the domain using the umount command. If you try to remove a file domain that has mounted filesets, the system displays an error message indicating that a fileset is mounted. AdvFS will not remove the file domain.
The -f flag is useful for scripts when you do not want to be queried for each file domain. If you choose the -f flag, no message prompt will display. The rmfdmn command will operate as if you responded yes to the prompt.
You must have root user privilege to use this command.
You must update the /etc/fdmns directory to delete the file domain entry for the deleted file domain.
rmfset
Description
The rmfset command removes a fileset or a clone fileset from an AdvFS file domain.
/sbin/rmfset [-f] domain fileset
domain specifies the name of an existing AdvFS file domain. fileset specifies the name of the fileset to be removed from the specified file domain.
Options
-f Turns off the message prompt.
Operation
The rmfset command removes a fileset (and all of its files) from an existing AdvFS file domain.
Unmount the fileset before removing it with the rmfset command. A fileset or clone fileset cannot be removed with this command if it is mounted. A fileset that has a clone fileset cannot be removed with this command until the clone fileset has been removed.
The -f flag is useful for scripts or when you do not want to be queried about each fileset. If you choose the -f flag, no prompts are displayed.
You must have root user privilege to use this command.
rmvol
Description
The rmvol command removes a volume from an existing AdvFS file domain.
/usr/sbin/rmvol [-f][-v] special domain
special specifies the block device special file name, such as /dev/disk/dsk2c, of the volume that you are removing from the file domain. domain specifies the name of an existing AdvFS file domain.
Options
-f Turns off the message prompt.
-v Displays messages that describe which files are moved off the specified volume. Using this flag slows the rmvol process.
Operation
The rmvol utility enables you to decrease the number of volumes within an existing file domain. When you attempt to remove a volume, the file system automatically migrates the contents of that volume to another volume in the file domain.
The logical structure of the filesets in a file domain is unaffected when you remove a volume. If you remove a volume that contains a stripe segment, the rmvol utility moves the segment to another volume that does not already contain a stripe segment of the same file. If a file is striped across all volumes in the file domain, the utility requests confirmation before placing a second stripe segment on a volume that has one.
Before you can remove a volume from a file domain, all filesets in the file domain must be mounted. If you try to remove a volume from an active file domain that includes unmounted filesets, the system displays an error message indicating that a fileset is unmounted. This message is repeated until you mount all filesets in the file domain.
If you attempt to remove a volume from an inactive file domain, the system returns the ENO_SUCH_DOMAIN error message. A file domain is inactive when none of its filesets is mounted. In this case, the rmvol command does not remove the volume.
If there is not enough free space on other volumes in the file domain to accept the offloaded files from the departing volume, the rmvol utility moves as many files as possible to free space on other volumes. Then a message is sent to the console indicating that there is not enough space to complete the procedure. The files that were not yet moved remain on the original volume.
You can interrupt the rmvol process without damaging your file domain. AdvFS will stop removing files from the volume. Files already removed from the volume will remain in their new location. Interrupting a rmvol operation with the kill command can leave the volume in an inaccessible state. If a volume does not allow new allocations after an rmvol operation, use the chvol command with the -A flag to reactivate the volume.
You cannot run the rmvol utility while the defragment, balance, rmfset, or rmvol utility is running on the same domain.
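The conditions that must hold before rmvol can run (every fileset mounted, the domain active, and no conflicting utility running on the domain) can be collected into a single pre-flight check. The sketch below is hypothetical: it takes the mount state and the set of running utilities as plain inputs and does not query the system.

```python
from typing import Dict, List, Set

# Utilities the text says cannot run on the same domain at the same time.
CONFLICTING_UTILITIES = {"defragment", "balance", "rmfset", "rmvol"}


def rmvol_blockers(fileset_mounted: Dict[str, bool], running: Set[str]) -> List[str]:
    """Return the reasons rmvol would refuse to run; an empty list means none."""
    blockers = []
    if not any(fileset_mounted.values()):
        # No mounted filesets means the domain is inactive.
        blockers.append("domain is inactive (no filesets mounted)")
    else:
        for name in sorted(n for n, mounted in fileset_mounted.items() if not mounted):
            blockers.append(f"fileset {name} is not mounted")
    for utility in sorted(CONFLICTING_UTILITIES & set(running)):
        blockers.append(f"{utility} is running on the domain")
    return blockers


if __name__ == "__main__":
    print(rmvol_blockers({"usr": True, "var": False}, {"balance"}))
```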
You must have root user privilege to use this utility.
salvage
Description
The salvage command recovers file data from damaged AdvFS file domains. This is a new command in the Tru64 UNIX Version 5.0 release. (A field test version of salvage is in the DIGITAL UNIX Version 4.0D release.)
/sbin/advfs/salvage [-x|-p] [-l] [-S] [-v number] [-d time] [-D directory] [-L path] [-o option] { -V special [-V special]... | domain } [fileset[path]]
domain specifies the name of an existing AdvFS file domain from which filesets are to be recovered. Use this parameter when you want the utility to obtain volume information from the /etc/fdmns directory. The volume information used by the utility consists of the device special file names of the AdvFS volumes in the file domain. When the domain parameter is specified without optional arguments, the utility attempts to recover the files in all filesets in the domain.
Do not use this parameter when you want to use the -V special flag to specify device special file names of AdvFS volumes. If you do, the utility displays an error message and exits with an exit value of 2.
fileset [path] specifies the name of a fileset to be recovered from a domain or a volume.
Specify path to indicate the path of a directory or file in a fileset. When you specify a path that is a directory, the utility attempts to recover only the files in that directory tree, starting at the specified directory. When you specify a path that is a file, the utility attempts to recover only that file. Specify path relative to the mount point of the fileset.
Options
-d time Specifies the time, as a decimal number in this format: [[CC]YY]MMDDhhmm[.SS]
-D directory Specifies the path of the directory to which all recovered files are written. If you do not specify a directory, the utility writes recovered files to the current working directory.
-l Specifies verbose mode for messages written to the log file for every file that is encountered during the recovery. If you do not specify this flag, the utility writes a message to the log file only for partially recovered and unrecovered files.
-L path Specifies the path of the directory or the file name for the log file you choose to contain messages logged by this utility. If you include a log file name in the path, the utility uses that file name. If no log file name is specified, the utility places the log file in the specified directory and names it salvage.log.pid (PID is the process ID of the user process). When you do not specify this flag, the utility places the log file in the current working directory and names it salvage.log.pid.
-o option Specifies the action the utility takes when a file being recovered already exists in the directory to which it is to be written. The values for option are:
• yes, overwrite the existing file without querying the user. This is the default action when option is not specified.
• no, do not overwrite the existing file.
• ask, ask the user whether to overwrite the existing file.
-p Specifies that the utility identifies a partially recovered file by appending ’.partial’ to its file name.
-S Specifies that the utility is to run in sequential search mode, checking each page on each volume in the domain. This mode of operation will take a long time on large AdvFS file domains. This flag can be used to recover most files from a domain which has been damaged from an incorrect execution of the mkfdmn utility. In some cases, the recovery will need to generate names based on the file’s tag number. These cases usually happen in the root directory, because mkfdmn usually overwrites this directory.
When you specify this flag, there may be a security issue, because the utility could recover old filesets and deleted files.
-F format Specifies that salvage should recover files in an archive format. The only legitimate value for format is ’tar’.
-f [archive] Salvage uses the next argument as the name of an archive. If the name is ’-’, salvage writes to standard output.
-v number Specifies the type of messages directed to stdout. If you do not specify this flag, the default is to direct only error messages to stdout. If you specify number to be 1, both errors and the names of partially recovered files are directed to stdout. If you specify number to be 2, error messages and the status of all files as they are recovered are directed to stdout.
-V special [-V special] Specifies block device special file names of volumes in the domain, for example /dev/disk/dsk3c. The utility attempts to recover files only from the volumes you specify. If you do not specify the -V flag, you must specify the domain parameter so that the utility can obtain the special file names of the volumes in the domain from the /etc/fdmns directory. Do not use this flag with the domain parameter. If you do, an error message is displayed and the utility exits with an exit value of 2.
-x Specifies that partially recoverable files are not to be recovered. If you do not use this flag, partially recoverable files are recovered. Do not use the -x flag with the -p flag. If you do, the utility displays an error message and exits with an exit value of 2.
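The [[CC]YY]MMDDhhmm[.SS] format accepted by the -d flag can be validated before it is passed to salvage. The sketch below is one possible parser. The guide gives only the format, so two choices here are assumptions: a missing year is filled from a caller-supplied default, and a two-digit year is mapped with the pivot commonly used for this format by touch -t (69 to 99 become 19YY, 00 to 68 become 20YY). The function name is hypothetical.

```python
from datetime import datetime


def parse_salvage_time(text: str, default_year: int) -> datetime:
    """Parse a [[CC]YY]MMDDhhmm[.SS] string into a datetime."""
    main, dot, seconds = text.partition(".")
    if dot and not (len(seconds) == 2 and seconds.isdigit()):
        raise ValueError("seconds must be two digits")
    if not main.isdigit() or len(main) not in (8, 10, 12):
        raise ValueError("expected 8, 10, or 12 digits before the optional .SS")
    if len(main) == 12:
        year = int(main[:4])
    elif len(main) == 10:
        yy = int(main[:2])
        # Assumption: same two-digit-year pivot as touch -t.
        year = (1900 if yy >= 69 else 2000) + yy
    else:
        year = default_year  # assumption: the caller supplies the year
    tail = main[-8:]  # always MMDDhhmm
    month, day, hour, minute = (int(tail[i:i + 2]) for i in range(0, 8, 2))
    return datetime(year, month, day, hour, minute, int(seconds) if dot else 0)


if __name__ == "__main__":
    print(parse_salvage_time("200001151230.45", default_year=2000))
```

Out-of-range fields such as month 13 are rejected by the datetime constructor, which also raises ValueError.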
Operation
The salvage utility helps you recover file data after an AdvFS file domain has become unmountable due to some type of data corruption. Errors that could cause data corruption of a file domain include I/O errors in file system metadata, the accidental removal of a volume, or any I/O error that produces a panic.
Use the salvage utility as a last resort. You should first repair domain structures by using the verify utility. If that repair method is unsatisfactory, attempt to recover fileset data from backup media. Only if both methods are unsatisfactory should you employ the salvage utility.
The salvage utility opens and reads block devices directly and could present a security issue if it recovers data remaining from previous AdvFS file domains while attempting to recover data from current AdvFS file domains.
The salvage utility can be run in single user mode, without mounting other file systems. The salvage utility is available from the UNIX Shell option when you are booting from the Tru64 UNIX Operating System Volume 1 CDROM.
The salvage utility can find metadata on disk that appears valid but might not be: in most cases, the utility can determine when this suspect metadata should be used or ignored. One of these problems that the utility cannot detect is the situation when the metadata contains a tag number that could be valid on a fileset with a very large number of files, but is usually invalid for common filesets. In this case, the utility creates a partial file in the lost+found directory.
The salvage utility has a built-in soft limit on the number of valid tags in a fileset: 10,000,000 tags. If an application should exceed this soft limit, the user is prompted about increasing the limit.
You must have root user privilege to use the salvage utility.
Before you use the salvage utility, all filesets in the domain you are trying to recover must be unmounted. They probably are already, but use the umount(8) command to make sure.
savemeta
Description
savemeta [-LSTtf] domain savedir
This script saves a snapshot of the specified domain metadata into a directory, savedir, that has the following structure:
/savedir/volume_directory/BMT_file
                         /log_file
                         /tag_file
        /fileset_directory/frag_file
                          /tag_file
Options
-L Does not write the domain’s log file to the savedir.
-S Does not save the volume’s SBM to the savedir.
-T Does not save the domain’s root tag file to the savedir.
-t Does not save the fileset tag files to the savedir.
-f Saves the structure information from the frag file in each fileset to the savedir.
shblk
Description
The shblk command displays unformatted disk blocks.
/sbin/advfs/shblk [-sb start_block] [-bc block_count] special
special specifies the volume on which the block(s) are located.
Options
-sb start_block Specifies the block at which the display starts.
-bc block_count Specifies the number of blocks to print.
Operation
The shblk command displays an unformatted hexadecimal listing of the information that is present in the selected blocks.
You must have root user privileges to access this command.
shfragbf
Description
Use this command to display how much space is used on the frag file.
/sbin/advfs/shfragbf file_system/.tags/1
file_system specifies the fileset mount point of the file system to display.
Operation
This command also displays the following frag file information:
• The frag type is listed as 0K when the frag file is not in use. The type is listed as 1K for 1K frags, and so forth.
• Grps specifies the number of groups of the frag type.
• bad specifies the number of bad group headers of this type.
• Frags specifies the number of fragments of this type.
• free specifies the number of free frags of this type.
• in-use specifies the number of fragments in use.
• Bytes specifies the total bytes in this frag type.
You must have root user privilege to access this command.
showfdmn
Description
The showfdmn command displays the attributes of a file domain and detailed information about each volume in the file domain.
/sbin/showfdmn [-k] domain
domain specifies the name of an existing AdvFS file domain.
Options
-k Displays the total number of blocks and the number of free blocks in terms of 1K blocks instead of the default 512-byte blocks.
Operation
The showfdmn command displays the following file domain attributes:
• Id is a unique number (in hexadecimal format) that identifies the file domain.
• Date Created is the day, month, and time that a file domain was created.
• LogPgs is the number of 8-kilobyte pages in the transaction log of the specified file domain.
• Version is an internal-use-only version number for the AdvFS on-disk data structures. This number is not related to the version number of the base operating system.
• Domain Name is the name of the file domain.
The command also displays the following volume information:
• Vol is the volume number within the file domain. An L next to the number indicates that the volume contains the transaction log.
• 512-Blks is the size of the volume in 512-byte blocks.
• 1K-Blks is the size of the volume in 1K blocks.
• Free is the number of blocks in a volume that are available for use.
• % Used is the percent of the volume space that is currently allocated to files or metadata.
• Cmode is the I/O consolidation mode. The default is on.
• Rblks is the maximum number of 512-byte blocks read from the volume at one time.
• Wblks is the maximum number of 512-byte blocks written to the volume at one time.
• Vol Name is the name of the special device file for the volume.
For multivolume file domains, the showfdmn command also displays the total volume size, total number of free blocks, and the total percent of volume space currently allocated.
A file domain must be active before the showfdmn command can display volume information. A file domain is active when at least one fileset in the file domain is mounted.
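The relationship between the block counts, the -k flag, and the % Used figure can be sketched as follows. This is a minimal illustration, not the showfdmn implementation: the function names are hypothetical, the sample numbers are made up, and the rounding that showfdmn itself applies is not documented here.

```python
from typing import Iterable, Tuple


def blocks_512_to_1k(blocks_512: int) -> int:
    """Convert a count of 512-byte blocks to 1K blocks (two 512-byte blocks per 1K).

    Assumption: an odd trailing 512-byte block is dropped (floor division).
    """
    return blocks_512 // 2


def percent_used(total_blocks: int, free_blocks: int) -> float:
    """Percent of the volume space that is allocated to files or metadata."""
    if total_blocks <= 0:
        raise ValueError("total_blocks must be positive")
    return 100.0 * (total_blocks - free_blocks) / total_blocks


def domain_totals(volumes: Iterable[Tuple[int, int]]) -> Tuple[int, int, float]:
    """Sum (size, free) pairs for a multivolume domain and compute the overall % used."""
    vols = list(volumes)
    total = sum(size for size, _ in vols)
    free = sum(free for _, free in vols)
    return total, free, percent_used(total, free)


if __name__ == "__main__":
    # Hypothetical two-volume domain: (512-Blks, Free) per volume.
    print(domain_totals([(1000, 250), (3000, 750)]))
```

The same arithmetic applies whether the counts are in 512-byte or 1K blocks, as long as the size and free values use the same unit.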
showfile
Description
The showfile command displays the attributes of one or more AdvFS files.
/usr/sbin/showfile [-i] [-h | -x] filename...
filename... is one or more directory or file names. To display all the files in the current directory, specify an asterisk (*) as the filename argument.
Options
-h Displays the raw extent map including any holes.
-i When a filename is a directory, displays the attributes for the directory’s index file. (V5.x only)
-x Displays full storage allocation map (extent map) for files in an Advanced File System.
Operation
The showfile command also displays the extent map of each file. An extent is a contiguous area of disk space that the file system allocates to a file. Simple files have one extent map; striped files have an extent map for every stripe segment.
You can list AdvFS attributes for an individual file or the contents of a directory. Although the showfile command lists both AdvFS and non-AdvFS files, the command displays meaningful information for AdvFS files only.
The showfile command displays the following file attributes:
• Id is the unique number (in hexadecimal format) that identifies the file. Digits to the left of the dot (.) character are equivalent to a UFS inode.
• Vol is the location of primary metadata for the file, expressed as a number. The data extents of the file can reside on another volume.
• PgSz is the page size in 512-byte blocks.
• Pages is the number of pages allocated to the file.
• XtntType is the extent type. The extent type can be simple, which is a regular AdvFS file without special extents; stripe, which is a striped file; symlink, which is a symbolic link to a file; ufs, nfs, and so on. The showfile command cannot display attributes for symbolic links or non-AdvFS files.
• Segs is the number of stripe segments per striped file, which is the number of volumes a striped file crosses. (Applies only to stripe type.)
• SegSz is the number of pages per stripe segment. (Applies only to stripe type.)
• I/O is the type of write request to this file:
— async specifies that write requests are buffered (the AdvFS default).
— synch specifies forced synchronous writes as described in chfile(8).
— ftx specifies that write requests are executed under AdvFS transaction control (reserved for metadata files and directories).
• Perf is the efficiency of file-extent allocation, expressed as a percentage of the optimal extent layout. A high percentage, such as 100%, indicates that the AdvFS I/O system has achieved optimal efficiency. A low percentage indicates the need for file defragmentation.
• File is the name of the directory or file. If the file is a directory that has an index file associated with it and the -i flag has not been specified, the statistics displayed are for the directory. The term index follows the directory name. If the file is a directory that has an index file associated with it and the -i flag is specified, the statistics displayed are for the index file associated with the directory. The name of the directory follows the index.
Whereas a simple file has one extent map, a striped file has more than one extent map. An extent map displays the following information:
• pageOff is the starting page number of the extent.
• pageCnt is the number of pages in the extent.
• vol is the location of the extent, expressed as a number.
• volBlock is the starting block number of the extent.
• blockCnt is the number of blocks in the extent.
• extentCnt is the number of extents.
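As an illustration of how the extent map fields relate, the following sketch looks up the volume and starting block for a given file page. It assumes that blocks within an extent are laid out contiguously from volBlock and that each page occupies PgSz 512-byte blocks (16 for an 8K page); the Extent type and locate_page function are hypothetical names, not part of showfile.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Extent(NamedTuple):
    """One row of an extent map, using the field names from the showfile display."""
    page_off: int   # pageOff: starting page number of the extent
    page_cnt: int   # pageCnt: number of pages in the extent
    vol: int        # vol: volume number holding the extent
    vol_block: int  # volBlock: starting block number on that volume
    block_cnt: int  # blockCnt: number of blocks in the extent


def locate_page(extents: Sequence[Extent], page: int,
                pg_sz: int = 16) -> Optional[Tuple[int, int]]:
    """Return (volume, starting block) for a file page, or None if the page is a hole.

    Assumption: page N of an extent starts pg_sz blocks after page N-1.
    """
    for ext in extents:
        if ext.page_off <= page < ext.page_off + ext.page_cnt:
            return ext.vol, ext.vol_block + (page - ext.page_off) * pg_sz
    return None


if __name__ == "__main__":
    # Hypothetical extent map with a hole between pages 4 and 7.
    extent_map = [Extent(0, 4, 1, 1000, 64), Extent(8, 2, 2, 5000, 32)]
    print(locate_page(extent_map, 2))   # page inside the first extent
    print(locate_page(extent_map, 5))   # page in the hole
```

A page that falls in no extent corresponds to a hole, which is what the -h flag makes visible in the raw extent map.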
showfsets
Description
The showfsets command displays the filesets (or clone filesets) and their characteristics in a specified domain.
/sbin/showfsets [-b | -q] [-k] domain [fileset...]
domain specifies the full path name of the file domain. fileset... specifies the name of one or more filesets.
Options
-b Lists the names of the filesets in a domain, without additional detail.
-k Displays the total number of blocks and the number of free blocks in terms of 1K blocks instead of the default 512-byte blocks.
-q Displays quota limits for filesets in a domain.
Operation
The following fileset characteristics are displayed:
• Fileset identifier is a combination of the file-domain identifier and an additional set of numbers that identify the fileset within the file domain.
• Clone status can include:
— Clone is specifies the name of a clone fileset, if one exists for the parent fileset.
— Clone of specifies the name of the parent fileset, if the displayed fileset is a clone fileset.
— Revision specifies the number of times you revised a clone fileset.
• Files specifies the number of files in the fileset and the current file usage limits (quotas). SLim, the soft limit, is a quota that can be exceeded for a fixed period of time and HLim, the hard limit, is a quota that cannot be exceeded.
• Blocks specifies the number of blocks that are in use by a mounted fileset and the current block soft and hard usage limits (quotas). For filesets that are not mounted, zero blocks will display. For an accurate display, the fileset must be mounted.
• Quota Status specifies which quota types are enabled (enforced).
The showfsets command with the -q flag set displays block and file information for a specified domain or for one or more named filesets in the domain. The characteristics of a named fileset are:
• BF (block flag) specifies block (B) and file (F) usage limits. A + in this field means that the soft block usage is exceeded; a * means that the hard limit is reached.
• Block (512) Limits specifies the number of blocks used, the soft limit (the number of blocks that can be exceeded for a period of time), the hard limit (the number of blocks that cannot be exceeded), and the grace period (the remaining time for which the soft limit may be exceeded).
• File Limits specifies the number of files used, the soft and hard file limits for the fileset, and the grace period remaining.
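The meaning of the + and * markers in the BF field can be sketched as a small function. This is an illustration only: the function names are hypothetical, and treating a limit of 0 as "no limit set" and combining one marker for blocks with one for files are assumptions, not documented showfsets behavior.

```python
def limit_flag(used: int, soft: int, hard: int) -> str:
    """Return '*' if the hard limit is reached, '+' if the soft limit is exceeded.

    Assumption: a limit of 0 means that no limit is set.
    """
    if hard > 0 and used >= hard:
        return "*"
    if soft > 0 and used > soft:
        return "+"
    return " "


def bf_field(blocks: tuple, files: tuple) -> str:
    """Build a two-character field: block marker followed by file marker.

    Each argument is a (used, soft_limit, hard_limit) triple.
    """
    return limit_flag(*blocks) + limit_flag(*files)


if __name__ == "__main__":
    # Hypothetical fileset: blocks over the soft limit, files within limits.
    print(repr(bf_field((150, 100, 200), (10, 50, 60))))
```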
stripe
Description
The stripe utility enables you to improve the read/write performance of a file by spreading it evenly across several volumes in a file domain.
/usr/sbin/stripe -n volume_count filename
filename specifies the name of the file to stripe.
You must have root user privileges to access this command.
Operation
The stripe utility directs a zero-length file (a file with no data written to it yet) to be spread evenly across several volumes within a file domain. As data is appended to the file, the data is spread across the volumes. AdvFS determines the number of pages per stripe segment and alternates the segments among the disks in a sequential pattern.
Existing, nonzero-length files cannot be striped using the stripe utility. To stripe an existing file, create a new file, use the stripe utility to stripe the new file, and copy the contents of the file you want to stripe into the new striped file. After copying the file, delete the nonstriped file.
Once a file is striped, you cannot use the stripe utility to modify the number of disks that a striped file crosses. To change the volume count of a striped file, you can create a second file with a new volume count, and then copy the contents of the first file into the second file. After copying the file, delete the first file.
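The sequential alternation of stripe segments among volumes can be pictured with the following sketch. It assumes a plain round-robin layout; stripe_volume is a hypothetical helper, and the value it returns is a 0-based position in the list of volumes the file is striped across, not an AdvFS volume number. AdvFS chooses the segment size itself, so seg_sz here is only an input to the illustration.

```python
def stripe_volume(page: int, seg_sz: int, volume_count: int) -> int:
    """Return which of the striped volumes (0-based position) holds a file page.

    page         -- file page number, starting at 0
    seg_sz       -- pages per stripe segment (SegSz in the showfile display)
    volume_count -- number of volumes the file is striped across (Segs)
    """
    if page < 0 or seg_sz <= 0 or volume_count <= 0:
        raise ValueError("page must be >= 0; seg_sz and volume_count must be > 0")
    segment = page // seg_sz          # which stripe segment the page falls in
    return segment % volume_count     # segments rotate through the volumes in order


if __name__ == "__main__":
    # With 2 pages per segment and 3 volumes, pages rotate 0,0,1,1,2,2,0,0,...
    print([stripe_volume(p, 2, 3) for p in range(8)])
```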
switchlog
Description
The switchlog command moves an AdvFS file domain transaction log.
/sbin/advfs/switchlog domain_name vol_id
domain_name specifies the name of an existing file domain. vol_id specifies the number of the new volume to use for the log.
Operation
The switchlog command relocates the transaction log of the specified file domain to a different volume in the same file domain. Moving the transaction log within a multivolume file domain is typically done to place the log on either a faster, less congested, or mirrored volume.
Use the showfdmn command to determine the current location of the transaction log. In the showfdmn command display, the letter L displays next to the volume that contains the log. The showfdmn command also displays all of the volumes and their volume numbers.
You must have root user privilege to execute this command.
tag2name
Description
The tag2name command displays the path name of a file given the tag number.
/sbin/advfs/tag2name tags_directory/tag
/sbin/advfs/tag2name [-r] domain fileset tag
domain specifies the name of an AdvFS file domain. fileset specifies the name of an AdvFS fileset. tags_directory specifies the relative path of the AdvFS tags directory for the fileset. tag specifies the AdvFS file tag number.
Options
-r Operates on the raw device (character device special file) of the fileset instead of the block device.
Operation
Internally, AdvFS identifies files by tag numbers (similar to inodes in UFS). Internal messages, error messages, and output from diagnostic utilities usually specify a tag number in place of a file name. Use the tag2name command to determine the name and path of the file identified by a tag number.
Each mounted AdvFS fileset has a .tags directory in its mount point. To obtain the file name, specify the path to the .tags directory for the fileset, followed by the tag number. The full path name of the corresponding file will be printed to stdout.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the tag2name utility fails in this situation, run it again.
You must have root user privilege to access this command. The tag you specify must be numeric and greater than 1.
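The tags_directory/tag argument can be assembled from a fileset mount point and a tag number, as in the following sketch. The tags_path function is a hypothetical helper that only builds and validates the argument string; it does not call tag2name.

```python
import posixpath


def tags_path(mount_point: str, tag: int) -> str:
    """Build the tags_directory/tag argument for a mounted fileset.

    The tag must be numeric and greater than 1, as the command requires.
    """
    if isinstance(tag, bool) or not isinstance(tag, int) or tag <= 1:
        raise ValueError("tag must be an integer greater than 1")
    return posixpath.join(mount_point, ".tags", str(tag))


if __name__ == "__main__":
    # Hypothetical fileset mounted on /usr, tag number 5.
    print(tags_path("/usr", 5))
```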
vbmtchain
Description
The vbmtchain utility displays metadata for a file including the time stamp, extent map, and whether the file is a user directory or data file. (Valid for V4.x only. Use nvbmtpg for V5.0.)
/sbin/advfs/vbmtchain BMT_page cell special [special 2...]
BMT_page specifies the page within the bitfile metadata table (BMT) of the volume that contains the file’s mcell. cell specifies the cell of the BMT page that contains the file’s mcell. special specifies the volume on which the file’s primary mcell is located. special 2... specifies the other volumes in this domain that may be accessed to follow the file’s mcell chain.
Operation
The file is described by the location of its primary mcell. Each mcell location is composed of three parts: volume, page within the BMT file located on that volume, and cell within the BMT page.
The primary mcell for the root tag directory is found in the BMT of the volume containing the log. To find this volume for a domain, use the showfdmn or the advscan command. The volume marked "L" contains the log.
Certain metadata files are in fixed locations:
Metadata file            Page  Cell  Volume
Bitfile metadata table   0     0     Every volume
Storage bitmap           0     1     Every volume
Root tag directory       0     2     Volume with log
Transaction log file     0     3     Volume with log
The vbmtchain utility displays the attributes of the file including the time stamp, the extent map, and whether the file is a user directory or a data file.
You must have root user privileges to access this command.
vbmtpg
Description
The vbmtpg utility displays a complete, formatted page of the BMT for a mounted or unmounted domain. (Valid for V4.x only. Use nvbmtpg for V5.0.)
/sbin/advfs/vbmtpg special [page_LBN]
special specifies the volume on which the page is located. page_LBN specifies the logical block number (LBN) of the requested page; the default is 32, which is page zero of the bitfile metadata table (BMT).
Operation
The vbmtpg utility is useful for debugging when there has been some seemingly random file corruption.
Note that the vbmtchain command displays all the mcells associated with a given file whereas the vbmtpg command displays a page of information. This page may contain information for more than one file and may not provide complete information on any file.
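The three-part mcell location used by vbmtchain, together with the fixed page and cell positions of the reserved metadata files, can be modeled as in the sketch below. McellId, FIXED_CELLS, and fixed_mcell are hypothetical names chosen for illustration; they do not correspond to AdvFS kernel structures. For the root tag directory and the transaction log, the caller is assumed to pass the number of the volume that holds the log.

```python
from typing import NamedTuple


class McellId(NamedTuple):
    """Location of an mcell: volume, page within that volume's BMT, cell within the page."""
    volume: int
    page: int
    cell: int


# (page, cell) of the metadata files that live in fixed locations.
FIXED_CELLS = {
    "bitfile metadata table": (0, 0),  # on every volume
    "storage bitmap": (0, 1),          # on every volume
    "root tag directory": (0, 2),      # only on the volume with the log
    "transaction log file": (0, 3),    # only on the volume with the log
}


def fixed_mcell(name: str, volume: int) -> McellId:
    """Return the primary mcell location of a fixed metadata file on a given volume."""
    page, cell = FIXED_CELLS[name]
    return McellId(volume, page, cell)


if __name__ == "__main__":
    # Hypothetical domain whose log is on volume 2.
    loc = fixed_mcell("root tag directory", 2)
    print(loc.page, loc.cell)  # the BMT_page and cell arguments for vbmtchain
```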
vdf
Description
The vdf utility displays disk information for AdvFS domains and filesets. This command is new in Tru64 UNIX V5.0.
/sbin/advfs/vdf [-k][-l] domain | domain#fileset
domain is the full path name of an AdvFS file domain. When a domain argument is specified, the default display contains information about: the number of disk blocks allocated to the domain; the number of disk blocks in use by the domain; and the number of disk blocks that are available to the domain.
domain#fileset is the name of an AdvFS fileset in an AdvFS file domain. When a domain#fileset argument is specified, the default display contains information about: the number of disk blocks allocated to the fileset; the number of disk blocks in use by the fileset; and the number of disk blocks that are available to the fileset. This information is in the same format as that displayed by the df command.
Options
-k Displays disk blocks as 1024-byte blocks instead of the default of 512-byte blocks.
-l Specifies that the default information for both the domain and filesets is reformatted to show the relationships between them. For example, any domain metadata displayed is the total metadata shared by filesets in the domain.
Operation
The vdf utility is a script that reformats output from the showfdmn, showfsets, shfragbf, and df utilities in order to display information about the disk usage of AdvFS file domains and filesets. In addition, the utility computes and displays the sizes of metadata files in a domain or fileset.
The disk space used by clone filesets is not calculated. If clone filesets are present in the specified domain, the utility displays a warning message.
You must have root user privilege to access this command.
This command cannot be used on filesets that are NFS mounted. All filesets in a domain must be mounted in order to calculate the disk usage of the domain.
vdump
Description
The vdump (rvdump) utility performs full and incremental backups on filesets.
/sbin/vdump -h
/sbin/vdump -V
/sbin/vdump -w
/sbin/vdump [-0..9] [-CDNUquv] [-F num_buffers] [-T tape_num] [-b size] [-f device] [-x num_blocks] fileset
/sbin/rvdump -h
/sbin/rvdump -V
/sbin/rvdump -w
/sbin/rvdump [-0..9] [-CDNUquv] [-F num_buffers] [-T tape_num] [-b size] [-f nodename:device] [-x num_blocks] fileset
fileset specifies the full path name of a mounted AdvFS fileset to be backed up. Alternatively, specifies a mounted NFS or UFS file system. When used with the -D flag, specifies a subdirectory.
Options
-h Displays usage help for the command.
-V Displays the current version of the command.
-w Displays the filesets that have not been backed up within one week.
-0..9 Specifies the backup level. The value 0 for this flag causes the entire fileset to be backed up to the storage device. The default backup level is 9.
-C Compresses the data as it is backed up, which minimizes the saveset size.
-D Performs a level 0 backup on the specified subdirectory. This flag overrides any backup level specification in the command. If this flag is specified, the AdvFS user and group quota files and the fileset quotas are not backed up.
-N Does not rewind the storage device, when it is a tape.
-P Produces backward compatible savesets that can be read by earlier versions of the vrestore command.
-U Does not unload the storage device, when it is a tape.
-q Displays only error messages; does not display information messages.
-u Updates /etc/vdumpdates file with a timestamp entry from beginning of backup.
-v Displays the names of the files being backed up.
-F num_buffers Specifies the number of in-memory buffers to use. The valid range is 2 through 64 buffers; the default is 8 buffers. The size of the in-memory buffers is determined by the value of the -b flag.
-T tape_num Specifies the starting number for the first tape. The default number is 1. The tape number is used only to prompt the operator to load another tape in the drive.
-b size Specifies the number of 1024-byte blocks per record in the saveset. The valid range is 1 through 64 blocks; the default is 60 blocks per record. The value of this flag also determines the size of the in-memory buffers.
-f device
-f node:device
Specifies the destination of the saveset. For vdump, the local destination can be a device, a file, or, when the - (dash) character is specified, standard output. For rvdump, the specification must be in the format nodename:device to specify the remote machine name that holds the device, file, or standard output.
-x num_blocks Specifies an "exclusive or" (XOR) operation each time the number of blocks specified by num_blocks is written to the saveset. The XOR operation is performed on the blocks and the result is written to the saveset as an XOR block that immediately follows the blocks. Subsequently, you can use the vrestore command to recover one of the blocks in the group should a read error occur. The valid range is 2 through 32 blocks; the default is 8 blocks. Using the -x flag creates larger savesets and increases the amount of time required to back up a file system, but offers additional protection from saveset errors.
Operation
The vdump command backs up files and any associated extended attributes (including ACLs, see the proplist(4) and acl(4) reference pages) from a single mounted fileset or clone fileset to a local storage device.
The rvdump command backs up files and any associated extended attributes (including ACLs, see the proplist(4) and acl(4) reference pages) from a single mounted fileset or clone fileset to a remote storage device.
The vdump and rvdump commands are the backup facility for the AdvFS file system. However, the commands are file-system independent, and you can use them to back up other file systems, such as UFS and NFS.
The commands back up all files in the specified fileset that are new or changed since a certain date and produce a saveset on the storage device. The date is determined by comparing the specified backup level to previous backup levels recorded in the /etc/vdumpdates file. The default storage device is /dev/tape/tape0_d1. You can specify an alternate storage device by using the -f flag.
The commands perform either an incremental backup, level 9 to 1, or a full backup, level 0, depending on the desired level of backup and the level of previous backups recorded in the /etc/vdumpdates file. The commands back up all files that are new or have changed since the latest backup date of all backup levels that are lower than the backup level being performed. If a backup level that is lower than the specified level does not exist, the commands initiate a level 0 backup. A level 0 backup backs up all the files in the fileset.
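The protection that the -x flag provides rests on a general property of XOR: the XOR of a group of equal-sized blocks, combined with all but one of those blocks, reproduces the missing block. The sketch below demonstrates that property only. It is not the vdump saveset format, and xor_block and recover_block are hypothetical names.

```python
from typing import Sequence


def xor_block(blocks: Sequence[bytes]) -> bytes:
    """Return the byte-wise XOR of a group of equal-sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must be the same size")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_block(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the one unreadable block of a group from the others plus the XOR block."""
    return xor_block(list(surviving) + [parity])


if __name__ == "__main__":
    group = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # made-up data blocks
    parity = xor_block(group)
    # Pretend the middle block could not be read back.
    print(recover_block([group[0], group[2]], parity) == group[1])
```

Only one lost block per group can be rebuilt this way, which is why a smaller num_blocks value gives more protection at the cost of a larger saveset.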
After the backup operation is complete, you can use the vrestore -t command to verify that the backup contains the files you wanted to save. This command lists the name and size of each file in the saveset without restoring them.
The vdump and rvdump commands do not back up filesets that are not mounted. Filesets backed up by using the vdump or the rvdump command must be restored by using the vrestore or the rvrestore command. The vdump and rvdump commands are not interchangeable with the dump and rdump commands. Similarly, the vrestore and the rvrestore commands are not interchangeable with the restore and rrestore commands.
The vrestore command in DIGITAL UNIX versions earlier than Version 4.0 cannot be used to restore savesets produced by the vdump command in DIGITAL UNIX Version 4.0 or higher systems.
The /etc/vdumpdates file is written in ASCII and consists of a single record per line. You must be the root user to update this file or to change any record field. If you edit the /etc/vdumpdates file, be certain that all records follow the correct format. An incorrectly formatted record in this file may make the file inaccessible for updates or reads.
See the manpage for more information.
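The rule that vdump uses to pick the cutoff date for an incremental backup can be sketched as follows. The backup_since function is a hypothetical helper that takes the recorded backup dates as a mapping from level to date; it does not parse /etc/vdumpdates, whose record format is not described here.

```python
from datetime import datetime
from typing import Mapping, Optional


def backup_since(level: int, last_backup: Mapping[int, datetime]) -> Optional[datetime]:
    """Return the date after which files are backed up at the given level.

    The cutoff is the latest recorded backup among all levels lower than
    the requested level. None means that no lower-level backup exists,
    so a full (level 0) backup is performed.
    """
    if not 0 <= level <= 9:
        raise ValueError("backup level must be 0 through 9")
    lower = [when for lvl, when in last_backup.items() if lvl < level]
    return max(lower) if lower else None


if __name__ == "__main__":
    # Hypothetical history: a level 0 backup, then a level 5 backup a week later.
    history = {0: datetime(2000, 1, 1), 5: datetime(2000, 1, 8)}
    print(backup_since(9, history))  # relative to the level 5 backup
    print(backup_since(5, history))  # relative to the level 0 backup
    print(backup_since(0, history))  # None: full backup
```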
verify
Description
The verify command checks on-disk structures such as the bitfile metadata table (BMT), the storage bitmaps, the tag directory and the frag file for each fileset. The verify command should be used in place of the msfsck and vchkdir commands available in DIGITAL UNIX V3.2x.
/sbin/advfs/verify [-a | -f] [-l | -d] [-v | -q] [-t] [-r] [-F] domain_name
domain_name specifies the file domain.
Options
-a Checks an active domain. All filesets of a domain must be mounted.
-f Creates a symbolic link to "fix" a lost file in the /mount_point/lost+found directory; deletes any directory entries without associated files; deletes files that have storage-bitmap or extent-map problems; corrects inconsistencies in the storage bitmap.
-d Deletes lost files (that is, with no directory entry).
-D Checks a domain previously mounted with the -o dual option of the mount command.
-l Creates a symbolic link to the lost file in the /mountpoint/lost+found directory.
-v Prints file status information. Selecting this flag slows down the verify procedure.
-q Prints minimal file status information.
-t Displays the mcell totals.
-r Checks the root domain.
-F Mounts filesets of a file domain using mount -d option if there is a mount failure of the file domain. Use this flag with caution.
Operation
This command verifies that the directory structure is correct and that all directory entries reference a valid file (tag) and that all files (tags) have a directory entry.
The verify command checks the storage bitmap for double allocations and missing storage. It checks that all mcells in use belong to a bitfile and that all bitfiles have all of their mcells.
The verify command checks the consistency of free lists for mcells and tag directories. It checks that the mcells pointed to by tags in the tag directory match the corresponding mcells.
For each fileset in the specified file domain, the verify command checks the frag file headers for consistency. For each file that has a fragment, the frag file is checked to ensure that the frag is marked as in use.
You must have root user privilege to access this command.
Unless you are checking the root domain, all filesets in the file domain must be unmounted. The verify command automatically mounts all of the filesets in a file domain individually. If you choose the -r option when you run the verify command on the root domain, all filesets in the root domain must be mounted.
Run verify on /root and /usr from a single user mode. To run verify in single user mode, you must first run a mount update on the root (mount -u /). To run the command from multiuser mode, dismount any file system that you have mounted as /root or /usr and make sure there is no file activity.
If you run the verify command on a fileset that has any other file system (AdvFS or otherwise) mounted on it, an error results. If you have a fileset erroneously labeled as UFS and it overlaps a fileset labeled AdvFS, an error results. You can recover from this error by changing the erroneously-labeled fileset’s fstype field from ufs to unused with the disklabel -s command. After changing the disk label, run the verify command.
If the -F option is specified and the verify command is unable to mount a fileset due to a failure of the file domain, the fileset is mounted using the mount -d option. Use this option with extreme caution and only as a last resort when you cannot mount a fileset. The mount -d option mounts an AdvFS fileset without running recovery on the file domain. Mounting without running recovery will cause your file domain to be inconsistent.
If you use the -F option, the verify command starts some recovery on the file domain before you mount it.
vfile
Description
The vfile utility outputs the contents of a file from an unmounted domain. In Tru64 UNIX Version 5.0, the vfilepg command should be used in place of this command.
/sbin/advfs/vfile BMT_page cell special [special 2 ...]
BMT_page specifies the page within the bitfile metadata table (BMT) of the volume that contains the file’s mcell. cell specifies the cell of the BMT page that contains the file’s mcell. special specifies the volume on which the file’s primary mcell is located. special 2... specifies the other volumes in this domain that may be accessed to follow the file’s mcell chain.
Operation
The file is identified by the location of its primary mcell. Each mcell location is composed of three parts: volume, page within the BMT file located on that volume, and cell within the BMT page.
The primary mcell for the root tag directory is found in the BMT of the volume containing the log. To find this volume for a domain, use the showfdmn or the advscan command. The volume marked "L" contains the log.
Certain metadata files are in fixed locations:
Metadata file            Page  Cell  Volume
Bitfile metadata table   0     0     Every volume
Storage bitmap           0     1     Every volume
Root tag directory       0     2     Volume with log
Transaction log file     0     3     Volume with log
You must have root user privilege to access this command.
vfilepg
Description
The vfilepg command displays pages of an AdvFS file. This command is new in Tru64 UNIX V5.0. The vfilepg command should be used in place of the vfile command.
/sbin/advfs/vfilepg domain_id fileset_id file_id [ page | -a ] [-f d ]
/sbin/advfs/vfilepg volume_id -b block
/sbin/advfs/vfilepg domain_id fileset_id file_id -d dump_file
/sbin/advfs/vfilepg [-F] dump_file [ page | -a ] [-f d ]
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
fileset_id specifies an AdvFS fileset using the following format:
[-S] fileset | -T fileset_tag
Specify the -S flag to force the command to interpret the name you supply as a fileset name. Specify the fileset by entering either the name of the fileset, fileset, or the fileset’s tag number, -T fileset_tag.
file_id specifies a file name in the following format:
file | [-t] file_tag
Specify the file by entering either the file’s pathname, file, or the file’s tag number, -t file_tag.
dump_file specifies the name of a file that contains the output from this utility.
page specifies the file page number of a file.
Options
-a Specifies that all the pages in the file be displayed.
-b block Specifies the logical block number of a disk block on an AdvFS volume.
-d dump_file Specifies the name of a file that will contain the output of this utility.
-f d Specifies that the output is to be formatted in a directory hierarchy. The default, if this flag is not specified, is to format the output as a hexadecimal and ASCII dump.
-T fileset_tag Specifies the tag number for a fileset.
-t file_tag Specifies the tag number for a file.
volume Specifies the name of an AdvFS volume in an AdvFS file domain.
volume_index Specifies the index number of a volume in an AdvFS file domain.
Operation
The vfilepg utility formats, dumps, and displays AdvFS file pages. A file page is the unit of disk storage for an AdvFS file: 8 Kbytes of contiguous disk space.
The utility has the following functions:
• Format and display one file page or all the file pages of a file. The file can be in a mounted or unmounted fileset.
• Save the contents of a file in one fileset to a file in another fileset. The file written is called a dump file. The source file can be in a mounted or unmounted fileset; the output fileset must be mounted.
• Format and display a dump file that has been dumped using the utility.
• Format and display a disk block of a file. A disk block is always 512 bytes and is located by specifying its logical block number.
You can specify which file page is to be displayed (page zero is the default), or you can display all the file pages in a file. The default display of file page information is in hexadecimal and ASCII formats. If you use the -f d flag, the data is formatted as a directory page as it is displayed.
By default, the utility displays one 8 Kbyte file page. With the -a flag, it displays every page in the file; with the -b flag, it displays one 512-byte disk block.
It can be misleading to use this utility with the -r flag on a domain with mounted filesets. The utility does not synchronize its read requests with AdvFS file domain read and write requests. For example, AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, metadata may not have been flushed in time for the utility to read it, and consecutive reads of the same file page may return unpredictable or contradictory results. To avoid this problem, do not use the -r flag on an active domain.
The utility can fail to open a block device, even when there are no filesets mounted for the domain, if the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the vfilepg utility fails in this situation, run it again.
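The sizes quoted for vfilepg (8 Kbyte file pages, 512-byte disk blocks) imply simple arithmetic for relating a page number to byte offsets and block counts. The following is a minimal sketch of that arithmetic only; the function names are hypothetical and nothing here reads a real volume.

```python
from typing import Tuple

PAGE_BYTES = 8 * 1024                        # one AdvFS file page, per the text
BLOCK_BYTES = 512                            # one disk block, per the text
BLOCKS_PER_PAGE = PAGE_BYTES // BLOCK_BYTES  # 16 blocks in every page


def page_byte_range(page: int) -> Tuple[int, int]:
    """Return the [start, end) byte offsets that a file page covers in its file."""
    start = page * PAGE_BYTES
    return start, start + PAGE_BYTES


def page_of_offset(offset: int) -> int:
    """Return the file page number that contains a given byte offset."""
    return offset // PAGE_BYTES
```

The default page for vfilepg is page zero, which by this arithmetic covers bytes 0 through 8191 of the file.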
vfragpg
Description
The vfragpg command displays a single header page of a frag file. In Tru64 UNIX Version 5.0, the nvfragpg command should be used in place of this command.
/sbin/advfs/vfragpg special page_LBN
special specifies the block special device name, such as /dev/rz2c.
page_LBN specifies the logical block number of the page.
Operation
The vfragpg command allows you to see the structure of a single header page in a frag file.
Use showfile -x /usr/.tags/1 to locate the logical block number of the page.
You must have root user privileges to access this command.
vlogpg
Description
The vlogpg utility translates a 16-block part of a volume of an unmounted file system and formats it as a log page. In Tru64 UNIX Version 5.0, the nvlogpg command should be used in place of this command.
/sbin/advfs/vlogpg special [page_LBN]
special specifies the volume on which the log is located.
page_LBN specifies the logical block number (LBN) of the volume; the default is zero.
Operation
Use this utility with other file utilities for debugging. If the volume is mounted, use the showfile -x command to get the extent map before calling the vlogpg command. If the volume is unmounted, use the vbmtchain command to identify the extent information that locates the log.
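Once an extent map is in hand, turning a file page number into the page_LBN argument is a matter of finding the extent that covers the page and adding 16 blocks per page. The sketch below assumes each extent is known as a starting page, a page count, and a starting volume block; the Extent type, its field names, and the sample numbers in the usage note are hypothetical, not output copied from a real system.

```python
from typing import List, NamedTuple, Optional

BLOCKS_PER_PAGE = 16  # a page is a 16-block part of the volume, per the text


class Extent(NamedTuple):
    page_off: int   # first file page covered by this extent
    page_cnt: int   # number of pages in the extent
    vol_block: int  # LBN on the volume where the extent starts


def page_lbn(extents: List[Extent], page: int) -> Optional[int]:
    """Return the starting LBN of a file page, or None if no extent maps it."""
    for ext in extents:
        if ext.page_off <= page < ext.page_off + ext.page_cnt:
            return ext.vol_block + (page - ext.page_off) * BLOCKS_PER_PAGE
    return None
```

With two made-up extents, Extent(0, 4, 1000) and Extent(4, 8, 5000), page 5 is the second page of the second extent, so its LBN is 5000 + 16 = 5016.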
The vlogpg utility displays the pages and records needed to redo transactions that were in progress at the time of a crash.
You must have root user privileges to access this command.
vlsnpg
Description
The vlsnpg command displays the logical sequence number (LSN) of a page of the log. In Tru64 UNIX Version 5.0, the nvlogpg command should be used in place of this command.
/sbin/advfs/vlsnpg special [page_LBN]
special specifies the volume on which the page is located.
page_LBN specifies the logical block number (LBN) of the requested page; the default is zero.
Operation
Given the device and the LBN, the vlsnpg utility displays the logical sequence number of the page of the log. The page takes on the logical sequence number (LSN) of its first record. Use this command in a script to loop through logical sequence numbers for several pages to find the end of the log.
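The scripted loop suggested above can be sketched as a scan for the page with the highest LSN. This is a simplified illustration under stated assumptions: read_lsn is a hypothetical callable standing in for whatever wrapper runs vlsnpg on a page and parses its output (the output format is not shown in this guide), LSNs are treated as plain integers, and LSN wraparound is ignored.

```python
from typing import Callable, Optional


def find_log_end(read_lsn: Callable[[int], int], page_count: int) -> Optional[int]:
    """Return the index of the log page with the highest LSN, or None if there are no pages.

    Assumption: in a circular log, the page written most recently carries the
    highest LSN, and the page after it still holds older, lower-numbered records.
    """
    if page_count <= 0:
        return None
    best_page, best_lsn = 0, read_lsn(0)
    for page in range(1, page_count):
        lsn = read_lsn(page)
        if lsn > best_lsn:
            best_page, best_lsn = page, lsn
    return best_page
```

With made-up LSNs of 40, 50, 60, 10, 20, 30 for pages 0 through 5, the function returns page 2, the point where the sequence stops increasing.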
You must have root user privileges to access this command.
vrestore
Description
The vrestore (rvrestore) command restores files from savesets that are produced by the vdump and rvdump commands.
/sbin/vrestore -h
/sbin/vrestore -V
/sbin/vrestore -t [-f device]
/sbin/vrestore -l [-f device]
/sbin/vrestore -i [-mqv] [-f device] [-D path] [-o opt]
/sbin/vrestore -x [-mqv] [-f device] [-D path] [-o opt] [file ...]
/sbin/rvrestore -h
/sbin/rvrestore -V
/sbin/rvrestore -t [-f nodename:device]
/sbin/rvrestore -l [-f nodename:device]
/sbin/rvrestore -i [-mqv] [-f nodename:device] [-D path] [-o opt]
/sbin/rvrestore -x [-mqv] [-f nodename:device] [-D path] [-o opt] [file ...]
Options
-h Displays usage help for the command.
-V Displays the current version for the command.
-t Lists the names and sizes (in bytes) of all files contained in a saveset. Exception: the sizes of any AdvFS quota files are not shown.
-l Lists the entire saveset structure.
-i Permits interactive restoration of files read from a saveset. See the manpage for more information.
-x Extracts a specific file or files from the saveset. Use this flag as an alternative to using the add command in interactive mode. The -x flag can precede any other options, but the file ... list must be the last item on the command line.
-m Does not preserve the owner, group, or modes of each file from the device.
-q Prints only error messages; does not print information messages.
-v Writes the name of each file read from the storage device to the standard output device. Without this flag the vrestore command does not notify you about progress on reading from the storage device.
-f device
-f node:device
When an argument follows the -f flag, it specifies the name of the storage device that contains the saveset to be restored. The argument replaces the default device /dev/tape/tape0_d1.
For rvrestore, the specification must be in the format nodename:device to specify the remote machine name that holds the saveset to be restored.
-D path Specifies the destination path to which the files are restored. Without the -D flag, the files are restored to the current directory.
-o opt Specifies the action to take when a file already exists. The options are:
• yes, overwrite existing files without any query. This is the default.
• no, do not overwrite existing files.
• ask, ask whether to overwrite an existing file.
file... Specifies the file or files to restore when using the -x flag. All other flags must precede any file names on the command line.
Operation
The vrestore and rvrestore commands restore data from a saveset previously archived by the vdump command or the rvdump command. The data, which can be restored from a file, a pipe, or a storage device (typically tape), is written to the specified directory. The default storage device from which files are read is /dev/tape/tape0_d1. You can use the -f flag to specify a different saveset. The vrestore and rvrestore commands restore any associated extended attributes, including ACLs, in the archive data. See the proplist(4) and acl(4) reference pages.
The vrestore and rvrestore commands are the restore facility for the AdvFS file system. However, the commands can be used to restore UFS and NFS files which have been archived by using the vdump or rvdump commands.
The default directory into which the files are restored is the current directory. You can specify an alternate directory by using the -D flag.
Use the -t flag to list the file names and sizes of the files in a saveset without restoring any files.
When you are using the interactive shell and the AdvFS user and group quota files are available in the saveset for restoration, the file names used to refer to them will be quota.user and quota.group, regardless of what the quota files are named in either the backed up fileset or in the destination fileset. Restoration of the quota files does not change the names of the quota files in the destination fileset.
If the destination fileset is AdvFS, and the saveset contains AdvFS fileset quotas, the fileset quotas are restored, even when they differ from the fileset quotas of the destination fileset. By using the -o no or -o ask options, you can prevent this behavior.
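The three -o settings (yes, no, ask) amount to a small decision rule that is applied whenever the item being restored already exists. The sketch below only illustrates that rule; should_overwrite and its ask callback are hypothetical names, and the code does not reproduce vrestore's actual prompts or internals.

```python
from typing import Callable


def should_overwrite(policy: str, exists: bool,
                     ask: Callable[[], bool] = lambda: False) -> bool:
    """Decide whether to write an item, given the -o policy.

    policy: "yes" (the default for vrestore), "no", or "ask".
    exists: whether the destination item is already present.
    ask:    hypothetical callback that returns the user's answer for "ask".
    """
    if not exists:
        return True          # nothing to overwrite, so the policy does not apply
    if policy == "yes":
        return True
    if policy == "no":
        return False
    if policy == "ask":
        return ask()
    raise ValueError("unknown -o option: " + policy)
```

Because yes is the default, an existing item is replaced unless -o no or -o ask is given, which is why those two settings are the ones that protect existing fileset quotas.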
The vdump and rvdump commands can write many savesets to a tape. If you want to use the vrestore or the rvrestore commands to restore a particular saveset, you must first position the tape to the saveset by using the mt command with the fsf option. For example, to position a tape that is rewound at the beginning of its second saveset, you can enter the command mt fsf 1.
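The positioning rule generalizes: from a rewound tape, reaching the start of saveset n means skipping forward n - 1 file marks. The helper below is a small sketch of that count, with savesets numbered from 1; the function names are hypothetical, and the command is only built as an argument list, not executed.

```python
from typing import List


def fsf_count(saveset_number: int) -> int:
    """Number of file marks to skip from a rewound tape to reach a saveset (1-based)."""
    if saveset_number < 1:
        raise ValueError("savesets are numbered from 1")
    return saveset_number - 1


def mt_args(saveset_number: int) -> List[str]:
    """Build the mt argument list, or an empty list if no positioning is needed."""
    count = fsf_count(saveset_number)
    return ["mt", "fsf", str(count)] if count else []
```

For the second saveset this produces mt fsf 1, matching the example in the text; for the first saveset no mt command is needed because the rewound tape is already positioned there.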
The vdump and vrestore commands maintain the sparseness of AdvFS sparse files. However, sparse files that have been striped are still handled in the fashion of releases earlier than DIGITAL UNIX Version 4.0D: they are allocated disk space and filled with zeros.
You do not have to be the root user to use the vrestore command or the rvrestore command, but you must have write access to the directory to which you want to restore the files.
See rsh(8) for server and client access rules when using the rvdump or rvrestore commands.
Filesets that have been archived by using the vdump or rvdump commands must be restored by using the vrestore or rvrestore commands. The vdump and rvdump commands are not interchangeable with the dump and rdump commands. Similarly, the vrestore and rvrestore commands are not interchangeable with the restore and rrestore commands.
Only the root user can restore AdvFS quota files and fileset quotas. A warning message is displayed when a non-root user attempts to use the vrestore command to restore AdvFS quota files or fileset quotas.
The vrestore command in DIGITAL UNIX versions earlier than Version 4.0 cannot be used to restore savesets produced by the vdump command in DIGITAL UNIX Version 4.0 or higher systems.
AdvFS quota files can be restored to either an AdvFS fileset or a UFS file system, but UFS quota files cannot be restored to an AdvFS fileset. If AdvFS quota files are to be restored to a UFS file system, quotas must be enabled on the UFS file system. Otherwise, the operation fails.
AdvFS fileset quotas cannot be restored to a UFS file system because there is no UFS analog to AdvFS fileset quotas.
Attempting to do a vrestore or rvrestore to a base directory that has a default ACL or a default access ACL may cause unintended ACLs to be created on the restored files and directories. If ACLs are enabled on the system, check all ACLs after the vrestore or rvrestore.
vsbmpg
Description
The vsbmpg command displays a page from a storage bitmap (SBM) file. This command is new in Tru64 UNIX V5.0.
/sbin/advfs/vsbmpg [-v] sbm_id | domain_id
/sbin/advfs/vsbmpg sbm_id page [entry]
/sbin/advfs/vsbmpg sbm_id -a
/sbin/advfs/vsbmpg sbm_id -i index
/sbin/advfs/vsbmpg sbm_id -B block
/sbin/advfs/vsbmpg volume_id -b block
/sbin/advfs/vsbmpg volume_id -d dump_file
sbm_id specifies an SBM file using the following format:
volume_id | [-F] dump_file
The dump_file parameter is a previously-saved copy of the fileset’s SBM file. You can use the -F flag to force the utility to interpret the dump_file parameter as a file name if it has the same name as a domain name.
domain_id specifies an AdvFS file domain using the following format:
[-r] [-D] domain
By default, the utility opens all volumes using block device special files. Specify the -r flag to operate on the raw device (character device special file) of the domain instead of the block device. Specify the -D flag to force the utility to interpret the name you supply in the domain argument as a domain name.
volume_id specifies an AdvFS volume using the following format:
[-V] volume | domain_id volume_index
Specify the -V flag to force the utility to interpret the name you supply in the volume argument as a volume name. The volume name argument also can be a full or partial path for the volume, for example /dev/disk/dsk12a or dsk12a. Specifying a partial path name always opens the character device special file.
Alternatively, specify the volume by using arguments for its domain, domain_id, and its volume index number, volume_index.
page specifies the file page number of the SBM file.
entry specifies the index of the SBM word on the page.
Options
-a Displays all the pages of the SBM file.
-B block Displays the portion of the SBM that maps the specified block.
-b block Specifies a starting block for the part of an AdvFS volume that you want to format as an SBM page.
-d dump_file Specifies the name of a file that contains the output of this utility.
-i index Displays the SBM word specified by the index.
-v Checks the checksum on each page of the SBM.
Operation
Storage bitmaps (SBMs) are used to track free and allocated disk space of AdvFS volumes. Each volume in an AdvFS domain has one SBM file. The vsbmpg utility displays pages of an SBM file.
Using the vsbmpg command you can:
• Display SBM page summaries
• Display an SBM file page
• Display one SBM entry
• Display corrupted volumes
• Save or display an SBM file
For more information, see the vsbmpg reference page.
An active domain, which is a domain with one or more of its filesets mounted, has all of its volumes opened using block device special files. These devices cannot be opened a second time without first being unmounted. However, the character device special files for the volumes can be opened more than once while still mounted.
It can be misleading to use this utility on a domain with mounted filesets because the utility does not synchronize its read requests with AdvFS file domain read and write requests.
For example, AdvFS can be writing to the disk as the utility is reading from the disk. Therefore, when you run the utility, metadata may not have been flushed in time for the utility to read it, and consecutive reads of the same file page may return unpredictable or contradictory results. The domain itself is not harmed.
To avoid this problem, unmount all the filesets in the domain before using this utility.
The utility can fail to open a block device, even when there are no filesets mounted for the domain and the AdvFS daemon, advfsd, is running. The daemon, as it runs, activates the domain for a brief time. If the vsbmpg utility fails in this situation, run it again.
You must have root user privilege to use this command.
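The idea behind the -B and -i options, mapping a disk block to the bitmap word and bit that track it, can be illustrated with a generic bitmap sketch. Every layout parameter here is an assumption made for illustration: the 32-bit word size, the number of blocks covered by one bit, the number of words per page, and the least-significant-bit-first ordering are not taken from this guide and may not match the real SBM format.

```python
from typing import List, Tuple

BITS_PER_WORD = 32  # assumed word size, for illustration only


def locate_bit(block: int, blocks_per_bit: int, words_per_page: int) -> Tuple[int, int, int]:
    """Map a disk block to (page, word index within page, bit within word)."""
    bit_index = block // blocks_per_bit
    word_index = bit_index // BITS_PER_WORD
    return (word_index // words_per_page,
            word_index % words_per_page,
            bit_index % BITS_PER_WORD)


def is_allocated(words: List[int], block: int, blocks_per_bit: int) -> bool:
    """Test the bit covering a block in a flat list of bitmap words (LSB-first, assumed)."""
    bit_index = block // blocks_per_bit
    word = words[bit_index // BITS_PER_WORD]
    return bool((word >> (bit_index % BITS_PER_WORD)) & 1)
```

With the made-up values of 16 blocks per bit and 2000 words per page, block 1000 falls in bit 62 overall, which is bit 30 of word 1 on page 0.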
vtagpg
Description
The vtagpg utility displays a formatted page of a tag file. In Tru64 UNIX Version 5.0, the nvtagpg command should be used in place of this command.
/sbin/advfs/vtagpg special [page_LBN]
special specifies the volume on which the tag file is located.
page_LBN specifies the logical block number (LBN) of the page to be examined; the default is zero.
Operation
The vtagpg utility formats a page of the disk as a tag file page. Use this utility with other file utilities to locate file-structure anomalies for debugging.
If the volume is mounted, use the showfile -x command to get the extent map before calling the vtagpg command. If the volume is unmounted, call the vbmtchain command to identify the extent information.
Run the vtagpg utility to obtain the root tag file first because it has entries for each fileset in the domain. Then, run the utility again to view the tag file for the fileset under investigation. This tag information points to the fileset metadata.
The vtagpg utility displays tag entries that map a file tag number to a primary mcell location.
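The two-step procedure above (root tag file first, then the fileset's own tag file) is a two-level lookup from tags to primary mcell locations. The sketch below models it with in-memory dictionaries; the type and function names are hypothetical, and the dictionaries stand in for tag file pages that would really be read with vtagpg.

```python
from typing import Dict, NamedTuple


class McellLocation(NamedTuple):
    volume: int
    bmt_page: int
    cell: int


def resolve_file(root_tags: Dict[int, McellLocation],
                 tag_files: Dict[McellLocation, Dict[int, McellLocation]],
                 fileset_tag: int, file_tag: int) -> McellLocation:
    """Follow fileset tag -> fileset tag file -> file tag -> primary mcell.

    root_tags: root tag file entries, one per fileset in the domain.
    tag_files: stand-in for reading the tag file found at a given mcell.
    """
    fileset_tag_file = root_tags[fileset_tag]   # step 1: root tag file
    entries = tag_files[fileset_tag_file]       # step 2: that fileset's tag file
    return entries[file_tag]                    # tag entry -> primary mcell
```

A lookup that raises KeyError at either step corresponds to a missing tag entry, which is the kind of file-structure anomaly these utilities are used to locate.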
You must be the root user to use this command.