create/src/Base/Reader.h
Kevin Martin 83202d8ad6 Address the poor performance of the existing unique-name generation (#17944)
* Address the poor performance of the existing unique-name generation

As described in Issue 16849, the existing Tools::getUniqueName method
requires calling code to form a vector of existing names to be avoided.

This leads to poor performance both in the O(n) cost of building such a
vector and also getUniqueName's O(n) algorithm for actually generating
the unique name (where 'n' is the number of pre-existing names).

This has a particularly noticeable cost in documents with large numbers
of DocumentObjects because generating both Names and Labels for each new
object incurs this cost. During an operation such as importing this
results in an O(n^2) time spent generating names.

The other major cost is in the saving of the temporary backup file,
which uses name generation for the "files" embedded in the Zip file.
Documents can easily need several such "files" for each object in the
document.

This update includes the following changes:

Create UniqueNameManager to keep a list of existing names organized in
a manner that eases unique-name generation. This class essentially acts
as a set of names, with the ability to add and remove names and check if
a name is already there, with the added ability to take a prototype name
and generate a unique form for it which is not already in the set.
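
To make the idea concrete, here is a minimal sketch of such a name set. It is not FreeCAD's UniqueNameManager: the class name NameSetSketch, its members, and the rule of splitting a name into a base plus a trailing number are all assumptions made for illustration. It only shows how tracking the numeric suffixes per base lets a unique name be produced without scanning every existing name.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical sketch only; this is not FreeCAD's UniqueNameManager.
class NameSetSketch
{
public:
    void addName(const std::string& name)
    {
        if (names.insert(name).second) {
            auto [base, number] = split(name);
            suffixes[base].insert(number);
        }
    }

    void removeName(const std::string& name)
    {
        if (names.erase(name) == 0) {
            return;  // the name was never registered
        }
        auto [base, number] = split(name);
        auto entry = suffixes.find(base);
        entry->second.erase(entry->second.find(number));  // drop one instance only
        if (entry->second.empty()) {
            suffixes.erase(entry);
        }
    }

    bool containsName(const std::string& name) const
    {
        return names.count(name) != 0;
    }

    // Returns the bare base if nothing on that model exists, otherwise the
    // base followed by one more than the highest numeric suffix in use.
    std::string makeUniqueName(const std::string& prototype) const
    {
        const std::string base = split(prototype).first;
        auto entry = suffixes.find(base);
        if (entry == suffixes.end() && !containsName(base)) {
            return base;
        }
        unsigned long next = (entry == suffixes.end()) ? 1 : *entry->second.rbegin() + 1;
        std::string candidate = base + std::to_string(next);
        // Defensive re-check for odd inputs such as very long digit runs.
        while (containsName(candidate)) {
            candidate = base + std::to_string(++next);
        }
        return candidate;
    }

private:
    // Split "xyz12" into ("xyz", 12). At least one character stays in the
    // base and at most nine digits are read, so the number cannot overflow.
    static std::pair<std::string, unsigned long> split(const std::string& name)
    {
        std::size_t pos = name.size();
        unsigned long number = 0;
        unsigned long scale = 1;
        while (pos > 1 && name.size() - pos < 9
               && std::isdigit(static_cast<unsigned char>(name[pos - 1]))) {
            number += scale * static_cast<unsigned long>(name[pos - 1] - '0');
            scale *= 10;
            --pos;
        }
        return {name.substr(0, pos), number};
    }

    std::set<std::string> names;
    std::map<std::string, std::multiset<unsigned long>> suffixes;
};
```

In this sketch add, remove, lookup and unique-name generation are all logarithmic in the number of stored names in the common case, which is the property the paragraph above is after.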

Eliminate Tools::getUniqueName

Make DocumentObject naming use the new UniqueNameManager class

Make DocumentObject Label naming use the new UniqueNameManager class.
Labels are not always unique; unique labels are generated if the
settings at the time request it (and other conditions). Because of this
the Label management requires additionally keeping a map of counts
for labels which already exist more than once.
These collections are maintained via notifications of value changes on
the Label properties of the objects in the document.

Add Document::containsObject(DocumentObject*) for a definitive
test of an object being in a Document. This is needed because
DocumentObjects can be in a sort of limbo (e.g. when they are in the
Undo/Redo lists) where they have a parent linkage to the Document but
should not participate in Label collision checks.

Rename Document.getStandardObjectName to getStandardObjectLabel
to better represent what it does.

Use new UniqueNameManager for Writer internal filenames within the zip
file.

Eliminate unneeded Reader::FileNames collection. The file names
already exist in the FileList collection elements. The only existing
use for the FileNames collection was to determine if there were any
files at all, and with FileList and FileNames being parallel
vectors, they both had the same length so FileList could be used
for this test.
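
A small sketch of why the parallel collection is redundant, assuming a simplified registry: FileRegistrySketch, FileEntrySketch and PersistenceSketch are hypothetical stand-ins, and only the FileName/Object pairing and the hasFilenames name are taken from Reader.h.

```cpp
#include <string>
#include <utility>
#include <vector>

struct PersistenceSketch;  // hypothetical stand-in for Base::Persistence

// Each entry already carries its own file name.
struct FileEntrySketch
{
    std::string FileName;
    PersistenceSketch* Object;
};

class FileRegistrySketch
{
public:
    void addFile(std::string name, PersistenceSketch* object)
    {
        fileList.push_back(FileEntrySketch {std::move(name), object});
    }

    // No separate vector of names is needed: the entry list answers this.
    bool hasFilenames() const
    {
        return !fileList.empty();
    }

private:
    std::vector<FileEntrySketch> fileList;
};
```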

Use UniqueNameManager for document names and labels. This uses ad hoc
UniqueNameManager objects created on the spot on the assumption that
document creation is relatively rare and there are few documents, so
although the cost is O(n), n itself is small.

Use an ad hoc UniqueNameManager to name new DynamicProperty entries.
This is only done if a property of the proposed name already exists,
since such a check is more-or-less O(log(n)), almost never finds a
collision, and avoids the O(n) building of the UniqueNameManager.
If there is a collision an ad-hoc UniqueNameManager is built
and discarded after use.
The property management classes have a bit of a mess of methods
including several to populate various collection types with all
existing properties. Rather than introducing yet another such
collection-specific method to fill a UniqueNameManager, a
visitProperties method was added which calls a passed function for
each property. The existing code would be simpler if existing
fill-container methods all used this.
Ideally the PropertyContainer class would keep a central directory of
all properties ("static", Dynamic, and exposed by ExtensionContainer and
other derivations) and a permanent UniqueNameManager. However the
Property management is a bit of a mess making such a change a project
unto itself.
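
The visitor approach can be sketched as follows. This is an illustration under assumptions, not the FreeCAD code: PropertySketch, PropertyContainerSketch and propertyNames are hypothetical, and only the name visitProperties and the idea of calling a passed function for every property (with no short-circuit return) come from the description above.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a property; only the name matters here.
struct PropertySketch
{
    std::string name;
};

class PropertyContainerSketch
{
public:
    void addProperty(const std::string& name)
    {
        props[name] = PropertySketch {name};
    }

    // Call the visitor once for every property. The visitor returns void,
    // so the walk never stops early.
    void visitProperties(const std::function<void(const PropertySketch&)>& visitor) const
    {
        for (const auto& entry : props) {
            visitor(entry.second);
        }
    }

    // One possible fill-container helper written on top of the visitor
    // instead of duplicating the iteration logic.
    std::vector<std::string> propertyNames() const
    {
        std::vector<std::string> out;
        visitProperties([&out](const PropertySketch& prop) { out.push_back(prop.name); });
        return out;
    }

private:
    std::map<std::string, PropertySketch> props;
};
```

Any other collection, including an ad hoc name manager, could be filled the same way by passing a different lambda.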

The unit tests for Tools::getUniqueName have been changed to test
UniqueNameManager.makeUniqueName instead.
This revealed a small regression insofar as passing a prototype name
like "xyz1234" to the old code would yield "xyz1235" whether or
not "xyz1234" already existed, while the new code will return the next
name above the currently-highest name on the "xyz" model, which could
be "xyz" or "xyz1".

* Correct wrong case on include path

* Implement suggested code changes
Also change the semantics of visitProperties to not have any short-circuit return

* Remove reference through undefined iterator

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix up some comments for Doxygen

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-12-13 10:54:46 -06:00


/***************************************************************************
* Copyright (c) 2011 Jürgen Riegel <juergen.riegel@web.de> *
* *
* This file is part of the FreeCAD CAx development system. *
* *
* This library is free software; you can redistribute it and/or *
* modify it under the terms of the GNU Library General Public *
* License as published by the Free Software Foundation; either *
* version 2 of the License, or (at your option) any later version. *
* *
* This library is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU Library General Public License for more details. *
* *
* You should have received a copy of the GNU Library General Public *
* License along with this library; see the file COPYING.LIB. If not, *
* write to the Free Software Foundation, Inc., 59 Temple Place, *
* Suite 330, Boston, MA 02111-1307, USA *
* *
***************************************************************************/
#ifndef BASE_READER_H
#define BASE_READER_H
#include <bitset>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <vector>
#include <xercesc/framework/XMLPScanToken.hpp>
#include <xercesc/sax2/Attributes.hpp>
#include <xercesc/sax2/DefaultHandler.hpp>
#include <boost/iostreams/concepts.hpp>
#include "FileInfo.h"
namespace zipios
{
class ZipInputStream;
}
#ifndef XERCES_CPP_NAMESPACE_BEGIN
#define XERCES_CPP_NAMESPACE_QUALIFIER
using namespace XERCES_CPP_NAMESPACE;
namespace XERCES_CPP_NAMESPACE
{
class DefaultHandler;
class SAX2XMLReader;
} // namespace XERCES_CPP_NAMESPACE
#else
XERCES_CPP_NAMESPACE_BEGIN
class DefaultHandler;
class SAX2XMLReader;
XERCES_CPP_NAMESPACE_END
#endif
namespace Base
{
class Persistence;
/** The XML reader class
* This is an important helper class for the store and retrieval system
* of objects in FreeCAD. These classes mainly inherit the Base::Persistence
* base class and implement the Restore() method.
* \par
* The reader gets mainly initialized by the App::Document on retrieving a
* document out of a file. From there the Restore() method will subsequently
* be called on all stored objects.
* \par
* A simple example is the Restore of App::PropertyString:
* \code
void PropertyString::Save (short indent,std::ostream &str)
{
str << "<String value=\"" << _cValue.c_str() <<"\"/>" ;
}
void PropertyString::Restore(Base::Reader &reader)
{
// read my Element
reader.readElement("String");
// get the value of my Attribute
_cValue = reader.getAttribute("value");
}
* \endcode
* \par
* A more complicated example is the retrieval of the App::PropertyContainer:
* \code
void PropertyContainer::Save (short indent,std::ostream &str)
{
std::map<std::string,Property*> Map;
getPropertyMap(Map);
str << ind(indent) << "<Properties Count=\"" << Map.size() << "\">" << endl;
std::map<std::string,Property*>::iterator it;
for(it = Map.begin(); it != Map.end(); ++it)
{
str << ind(indent+1) << "<Property name=\"" << it->first << "\" type=\""
<< it->second->getTypeId().getName() << "\">" ;
it->second->Save(indent+2,str);
str << "</Property>" << endl;
}
str << ind(indent) << "</Properties>" << endl;
}
void PropertyContainer::Restore(Base::Reader &reader)
{
reader.readElement("Properties");
int Cnt = reader.getAttributeAsInteger("Count");
for(int i=0 ;i<Cnt ;i++)
{
reader.readElement("Property");
string PropName = reader.getAttribute("name");
Property* prop = getPropertyByName(PropName.c_str());
if(prop)
prop->Restore(reader);
reader.readEndElement("Property");
}
reader.readEndElement("Properties");
}
* \endcode
* \see Base::Persistence
* \author Juergen Riegel
*/
class BaseExport XMLReader: public XERCES_CPP_NAMESPACE_QUALIFIER DefaultHandler
{
public:
enum ReaderStatus
{
// This bit indicates that a partial restore took place somewhere in this Document
PartialRestore = 0,
// This bit is local to the DocumentObject being read, indicating a partial restore therein
PartialRestoreInDocumentObject = 1,
PartialRestoreInProperty = 2, // Local to the Property
PartialRestoreInObject = 3 // Local to the object partially restored itself
};
/// open the file and read the first element
XMLReader(const char* FileName, std::istream&);
~XMLReader() override;
/** @name boost iostream device interface */
//@{
using category = boost::iostreams::source_tag;
using char_type = char;
std::streamsize read(char_type* s, std::streamsize n);
//@}
bool isValid() const
{
return _valid;
}
bool isVerbose() const
{
return _verbose;
}
void setVerbose(bool on)
{
_verbose = on;
}
/** @name Parser handling */
//@{
/// get the local name of the current Element
const char* localName() const;
/// get the current element level
int level() const;
/// return true if the end of an element is reached, false otherwise
bool isEndOfElement() const;
/// return true if at the start of the document, false otherwise
bool isStartOfDocument() const;
/// return true if the end of the document is reached, false otherwise
bool isEndOfDocument() const;
/// read until a start element is found (\<name\>) or start-end element (\<name/\>) (with
/// the given name, if one is specified)
void readElement(const char* ElementName = nullptr);
/// Read in the next element. Return true if it succeeded and false otherwise
bool readNextElement();
/** read until an end element is found
*
* @param ElementName: optional end element name to look for. If given, then
* the parser will read until this name is found.
*
* @param level: optional level to look for. If given, then the parser will
* read until this level. Note that the parser only increases the level when
* finding a start element, not a start-end element, and decreases the level
* after finding an end element. So, if you obtain the parser level after
* calling readElement(), you should specify a level minus one when calling
* this function. This \c level parameter is only useful if you know the
* child element may have the same name as its parent, otherwise, using \c
* ElementName is enough.
*/
void readEndElement(const char* ElementName = nullptr, int level = -1);
/// read until characters are found
void readCharacters(const char* filename, CharStreamFormat format = CharStreamFormat::Raw);
/** Obtain an input stream for reading characters
*
* @return Return an input stream for reading characters. The stream will be
* automatically destroyed when you call readElement() or readEndElement(), or
* you can end it explicitly with endCharStream().
*/
std::istream& beginCharStream(CharStreamFormat format = CharStreamFormat::Raw);
/// Manually end the current character stream
void endCharStream();
/// Obtain the current character stream
std::istream& charStream();
/// read binary file
void readBinFile(const char*);
//@}
/** @name Attribute handling */
//@{
/// get the number of attributes of the current element
unsigned int getAttributeCount() const;
/// check if the read element has the given attribute
bool hasAttribute(const char* AttrName) const;
/// return the named attribute as an integer (does type checking); if missing return
/// defaultValue
long getAttributeAsInteger(const char* AttrName, const char* defaultValue = nullptr) const;
/// return the named attribute as unsigned integer (does type checking); if missing return
/// defaultValue
unsigned long getAttributeAsUnsigned(const char* AttrName,
const char* defaultValue = nullptr) const;
/// return the named attribute as a double floating point (does type checking); if missing
/// return defaultValue
double getAttributeAsFloat(const char* AttrName, const char* defaultValue = nullptr) const;
/// return the named attribute as a string; if missing return defaultValue
const char* getAttribute(const char* AttrName, const char* defaultValue = nullptr) const;
//@}
/** @name additional file reading */
//@{
/// add a read request of a persistent object
const char* addFile(const char* Name, Base::Persistence* Object);
/// process the requested file reads
void readFiles(zipios::ZipInputStream& zipstream) const;
/// Returns whether reader has any registered filenames
bool hasFilenames() const;
/// returns true if reading the file \a filename has failed
bool hasReadFailed(const std::string& filename) const;
bool isRegistered(Base::Persistence* Object) const;
virtual void addName(const char*, const char*);
virtual const char* getName(const char*) const;
virtual bool doNameMapping() const;
//@}
/// Schema Version of the document
int DocumentSchema {0};
/// Version of FreeCAD that wrote this document
std::string ProgramVersion;
/// Version of the file format
int FileVersion {0};
/// sets simultaneously the global and local PartialRestore bits
void setPartialRestore(bool on);
void clearPartialRestoreDocumentObject();
void clearPartialRestoreProperty();
void clearPartialRestoreObject();
/// return the status bits
bool testStatus(ReaderStatus pos) const;
/// set the status bits
void setStatus(ReaderStatus pos, bool on);
protected:
/// read the next element
bool read();
// -----------------------------------------------------------------------
// Handlers for the SAX ContentHandler interface
// -----------------------------------------------------------------------
/** @name Content handler */
//@{
void startDocument() override;
void endDocument() override;
void startElement(const XMLCh* const uri,
const XMLCh* const localname,
const XMLCh* const qname,
const XERCES_CPP_NAMESPACE_QUALIFIER Attributes& attrs) override;
void endElement(const XMLCh* const uri,
const XMLCh* const localname,
const XMLCh* const qname) override;
void characters(const XMLCh* const chars, const XMLSize_t length) override;
void ignorableWhitespace(const XMLCh* const chars, const XMLSize_t length) override;
//@}
/** @name Lexical handler */
//@{
void startCDATA() override;
void endCDATA() override;
//@}
/** @name Document handler */
//@{
void resetDocument() override;
//@}
// -----------------------------------------------------------------------
// Handlers for the SAX ErrorHandler interface
// -----------------------------------------------------------------------
/** @name Error handler */
//@{
void warning(const XERCES_CPP_NAMESPACE_QUALIFIER SAXParseException& exc) override;
void error(const XERCES_CPP_NAMESPACE_QUALIFIER SAXParseException& exc) override;
void fatalError(const XERCES_CPP_NAMESPACE_QUALIFIER SAXParseException& exc) override;
void resetErrors() override;
//@}
private:
int Level {0};
std::string LocalName;
std::string Characters;
unsigned int CharacterCount {0};
std::streamsize CharacterOffset {-1};
std::map<std::string, std::string> AttrMap;
using AttrMapType = std::map<std::string, std::string>;
enum
{
None = 0,
Chars,
StartDocument,
EndDocument,
StartElement,
StartEndElement,
EndElement,
StartCDATA,
EndCDATA
} ReadType {None};
FileInfo _File;
XERCES_CPP_NAMESPACE_QUALIFIER SAX2XMLReader* parser;
XERCES_CPP_NAMESPACE_QUALIFIER XMLPScanToken token;
bool _valid {false};
bool _verbose {true};
public:
struct FileEntry
{
std::string FileName;
Base::Persistence* Object;
};
std::vector<FileEntry> FileList;
private:
mutable std::vector<std::string> FailedFiles;
std::bitset<32> StatusBits;
std::unique_ptr<std::istream> CharStream;
};
class BaseExport Reader: public std::istream
{
public:
Reader(std::istream&, const std::string&, int version);
std::istream& getStream();
std::string getFileName() const;
int getFileVersion() const;
void initLocalReader(std::shared_ptr<Base::XMLReader>);
std::shared_ptr<Base::XMLReader> getLocalReader() const;
private:
std::istream& _str;
std::string _name;
int fileVersion;
std::shared_ptr<Base::XMLReader> localreader;
};
} // namespace Base
#endif
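
For orientation, the status-bit members declared above (testStatus, setStatus, setPartialRestore and the clearPartialRestore* helpers) could behave roughly like the sketch below. This is an assumption based only on the declarations and their comments, not the actual implementation in Reader.cpp; the class name StatusSketch is hypothetical.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical sketch of the status-bit handling; not the FreeCAD code.
class StatusSketch
{
public:
    // Same bit positions as XMLReader::ReaderStatus in the header above.
    enum ReaderStatus
    {
        PartialRestore = 0,
        PartialRestoreInDocumentObject = 1,
        PartialRestoreInProperty = 2,
        PartialRestoreInObject = 3
    };

    bool testStatus(ReaderStatus pos) const
    {
        return bits.test(static_cast<std::size_t>(pos));
    }

    void setStatus(ReaderStatus pos, bool on)
    {
        bits.set(static_cast<std::size_t>(pos), on);
    }

    // Assumed behavior: set the document-wide bit together with the local ones.
    void setPartialRestore(bool on)
    {
        setStatus(PartialRestore, on);
        setStatus(PartialRestoreInDocumentObject, on);
        setStatus(PartialRestoreInProperty, on);
        setStatus(PartialRestoreInObject, on);
    }

    // Clearing a local bit leaves the document-wide bit untouched.
    void clearPartialRestoreProperty()
    {
        setStatus(PartialRestoreInProperty, false);
    }

private:
    std::bitset<32> bits;
};
```

The point of the split is that a caller can reset the local bit after handling one property while the document-wide bit still records that something was only partially restored.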