create/src/Base/Writer.h
Kevin Martin edb8e4c937 Address performance of existing unique-name generation (Part 2) (#18676)
As described in Issue 16849, the existing Tools::getUniqueName method
requires calling code to form a vector of existing names to be avoided.

This leads to poor performance, both from the O(n) cost of building such a
vector and from getUniqueName's O(n) algorithm for actually generating
the unique name (where 'n' is the number of pre-existing names).

This has a particularly noticeable cost in documents with large numbers
of DocumentObjects because generating both the Name and the Label for each
new object incurs this cost. During an operation such as importing, this
results in O(n^2) time spent generating names.
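
To make the cost difference concrete, here is a minimal sketch that contrasts the two usage models. It is not the FreeCAD code: uniqueNameFromList and NameRegistry are hypothetical names, and the numbering scheme (base, base1, base2, ...) is an assumption chosen only for illustration.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the old usage model: the caller hands over every
// existing name, so each call pays O(n) just to look at the input.
inline std::string uniqueNameFromList(const std::string& base,
                                      const std::vector<std::string>& existing)
{
    // Rebuilt on every call: this is the O(n) part.
    const std::unordered_set<std::string> taken(existing.begin(), existing.end());
    std::string candidate = base;
    for (unsigned n = 1; taken.count(candidate) != 0; ++n) {
        candidate = base + std::to_string(n);
    }
    return candidate;
}

// Hypothetical persistent registry: the set of names lives as long as the
// owner does, so generating a name does not rescan anything.
class NameRegistry
{
public:
    std::string makeUnique(const std::string& base)
    {
        std::string candidate = base;
        unsigned& next = nextNumber[base];  // starts at 0 for a new base
        // insert() reports whether the name was free, so one call both
        // tests and reserves the candidate.
        while (!taken.insert(candidate).second) {
            candidate = base + std::to_string(++next);
        }
        return candidate;
    }

private:
    std::unordered_set<std::string> taken;
    std::unordered_map<std::string, unsigned> nextNumber;
};
```

Calling uniqueNameFromList once per imported object costs O(n) per call and therefore O(n^2) overall, whereas the registry does an average constant amount of work per new name in the common case.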

The other major cost is in the saving of the temporary backup file,
which uses name generation for the "files" embedded in the Zip file.
Documents can easily need several such "files" for each object in the
document.

This update makes the following changes to use the newly added
UniqueNameManager in place of the old Tools::getUniqueName
method, and deletes the latter to remove any temptation to use it,
since its usage model breeds inefficiency:

Eliminate Tools::getUniqueName, its local functions, and its unit tests.

Make DocumentObject naming use the new UniqueNameManager class.

Make DocumentObject Label naming use the new UniqueNameManager class.
This needs to monitor DocumentObject Labels for changes since this
property is not read-only. The special handling for the Label
property, which includes optionally forcing uniqueness and updating
links in referencing objects, has been mostly moved from
PropertyString to DocumentObject.

Add Document::containsObject(DocumentObject*) for a definitive
test of an object being in a Document. This is needed because
DocumentObjects can be in a sort of limbo (e.g. when they are in the
Undo/Redo lists) where they have a parent linkage to the Document but
should not participate in Label collision checks.

Rename Document.getStandardObjectName to getStandardObjectLabel
to better represent what it does.

Use new UniqueNameManager for Writer internal filenames within the zip
file.

Eliminate the unneeded Reader::FileNames collection. The file names
already exist in the FileList collection elements. The only existing
use for the FileNames collection was to determine if there were any
files at all, and since FileList and FileNames were parallel
vectors of the same length, FileList can be used
for this test instead.

Use UniqueNameManager for document names and labels. This uses ad hoc
UniqueNameManager objects created on the spot on the assumption that
document creation is relatively rare and there are few documents, so
although the cost is O(n), n itself is small.

Use an ad hoc UniqueNameManager to name new DynamicProperty entries.
This is only done if a property of the proposed name already exists,
since such a check is more-or-less O(log(n)), almost never finds a
collision, and avoids the O(n) building of the UniqueNameManager.
If there is a collision, an ad hoc UniqueNameManager is built
and discarded after use.
The property management classes have a bit of a mess of methods,
including several to populate various collection types with all
existing properties. Rather than introducing yet another such
collection-specific method to fill a UniqueNameManager, a
visitProperties method was added which calls a passed function for
each property. The existing methods (e.g. getPropertyMap) would be
simpler if they all used this, but the cost of calling a lambda
for each property must be considered. It would also clarify the semantics
of these methods, which vary somewhat in which properties
populate the passed collection, e.g. when there are duplicate names.
Ideally the PropertyContainer class would keep a central directory of
all properties ("static", Dynamic, and those exposed by ExtensionContainer
and other derivations) and a permanent UniqueNameManager. However, the
Property management is a bit of a mess, making such a change a project
unto itself.
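
The "check first, build only on collision" approach described for dynamic properties can be sketched as follows. This is an illustration under stated assumptions, not the actual PropertyContainer code: PropertyBag, its map-based storage, the visitor signature, and the numeric-suffix scheme are all hypothetical, and a plain std::unordered_set stands in for the ad hoc UniqueNameManager.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_set>

// Hypothetical container used only to show the shape of the idea.
class PropertyBag
{
public:
    // Calls the passed function once for each stored property name.
    void visitProperties(const std::function<void(const std::string&)>& visitor) const
    {
        for (const auto& entry : props) {
            visitor(entry.first);
        }
    }

    // Adds a property and returns the name it was actually stored under.
    std::string addProperty(const std::string& proposed)
    {
        std::string name = proposed;
        if (props.count(name) != 0) {  // cheap O(log n) lookup, rarely a hit
            // Collision: only now pay the O(n) cost of collecting every
            // existing name into a throwaway set.
            std::unordered_set<std::string> taken;
            visitProperties([&taken](const std::string& existing) {
                taken.insert(existing);
            });
            for (unsigned n = 1; taken.count(name) != 0; ++n) {
                name = proposed + std::to_string(n);
            }
        }  // the throwaway set is discarded here
        props.emplace(name, 0);
        return name;
    }

    std::size_t size() const
    {
        return props.size();
    }

private:
    std::map<std::string, int> props;  // the value type is a placeholder
};
```

In this toy the map alone could answer the uniqueness question; the separate set is there to mirror a situation where names come from several sources and a visitor is the only uniform way to enumerate them.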
2025-02-24 10:23:53 -06:00

304 lines
9.4 KiB
C++

/***************************************************************************
* Copyright (c) 2011 Jürgen Riegel <juergen.riegel@web.de> *
* *
* This file is part of the FreeCAD CAx development system. *
* *
* This library is free software; you can redistribute it and/or *
* modify it under the terms of the GNU Library General Public *
* License as published by the Free Software Foundation; either *
* version 2 of the License, or (at your option) any later version. *
* *
* This library is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU Library General Public License for more details. *
* *
* You should have received a copy of the GNU Library General Public *
* License along with this library; see the file COPYING.LIB. If not, *
* write to the Free Software Foundation, Inc., 59 Temple Place, *
* Suite 330, Boston, MA 02111-1307, USA *
* *
***************************************************************************/
#ifndef SRC_BASE_WRITER_H_
#define SRC_BASE_WRITER_H_
#include <fstream>
#include <set>
#include <string>
#include <sstream>
#include <vector>
#include <memory>
#include <zipios++/zipoutputstream.h>
#include <Base/UniqueNameManager.h>
#include "FileInfo.h"
namespace Base
{
class Persistence;
/** The Writer class
* This is an important helper class for the store and retrieval system
* of persistent objects in FreeCAD.
* \see Base::Persistence
* \author Juergen Riegel
*/
class BaseExport Writer
{
private:
// This overrides UniqueNameManager's suffix-locating function so that the last '.' and
// everything after it is considered suffix.
class UniqueFileNameManager: public UniqueNameManager
{
protected:
std::string::const_reverse_iterator
getNameSuffixStartPosition(const std::string& name) const override
{
// This is an awkward way to do this, because the FileInfo class only yields pieces of
// the path, not delimiter positions. We can't just use fi.extension().size() because
// both "xyz" and "xyz." would yield three; we need the length of the extension
// *including its delimiter* so we use the length difference between the fileName and
// fileNamePure.
FileInfo fi(name);
return name.rbegin() + (fi.fileName().size() - fi.fileNamePure().size());
}
};
public:
Writer();
virtual ~Writer();
/// switch the writer to XML-only mode (no files allowed)
void setForceXML(bool on);
/// check on state
bool isForceXML() const;
void setFileVersion(int);
int getFileVersion() const;
/// put the next entry with a given name
virtual void putNextEntry(const char* filename, const char* objName = nullptr);
/// insert a file as CDATA section in the XML file
void insertAsciiFile(const char* FileName);
/// insert a binary file BASE64 coded as CDATA section in the XML file
void insertBinFile(const char* FileName);
/// insert text string as CDATA
void insertText(const std::string& str);
/** @name additional file writing */
//@{
/// add a write request of a persistent object
std::string addFile(const char* Name, const Base::Persistence* Object);
/// process the requested file storing
virtual void writeFiles() = 0;
/// Set mode
void setMode(const std::string& mode);
/// Set modes
void setModes(const std::set<std::string>& modes);
/// Get mode
bool getMode(const std::string& mode) const;
/// Get modes
std::set<std::string> getModes() const;
/// Clear mode
void clearMode(const std::string& mode);
/// Clear modes
void clearModes();
//@}
/** @name Error handling */
//@{
void addError(const std::string&);
bool hasErrors() const;
void clearErrors();
std::vector<std::string> getErrors() const;
//@}
/** @name pretty formatting for XML */
//@{
/// get the current indentation
const char* ind() const
{
return indBuf;
}
/// increase indentation by one tab
void incInd();
/// decrease indentation by one tab
void decInd();
//@}
virtual std::ostream& Stream() = 0;
/** Create an output stream for storing character content.
*
* @param format: If Base64Encoded, the input will be base64 encoded before storing.
* If Raw, the input is assumed to be valid character data in
* the current XML encoding, and will be enclosed inside a
* CDATA section. The stream will scan the input and
* properly escape any CDATA ending inside.
* @return Returns an output stream.
*
* You must call endCharStream() to end the current character stream.
*/
std::ostream& beginCharStream(CharStreamFormat format = CharStreamFormat::Raw);
/** End the current character output stream
* @return Returns the normal writer stream for convenience
*/
std::ostream& endCharStream();
/// Return the current character output stream
std::ostream& charStream();
// NOLINTBEGIN
/// name for underlying file saves
std::string ObjectName;
protected:
struct FileEntry
{
std::string FileName;
const Base::Persistence* Object;
};
std::vector<FileEntry> FileList;
UniqueFileNameManager FileNameManager;
std::vector<std::string> Errors;
std::set<std::string> Modes;
short indent {0};
char indBuf[1024] {};
bool forceXML {false};
int fileVersion {1};
// NOLINTEND
public:
Writer(const Writer&) = delete;
Writer(Writer&&) = delete;
Writer& operator=(const Writer&) = delete;
Writer& operator=(Writer&&) = delete;
private:
std::unique_ptr<std::ostream> CharStream;
CharStreamFormat charStreamFormat;
};
/** The ZipWriter class
* This is an important helper class implementation for the store and retrieval system
* of persistent objects in FreeCAD.
* \see Base::Persistence
* \author Juergen Riegel
*/
class BaseExport ZipWriter: public Writer
{
public:
explicit ZipWriter(const char* FileName);
explicit ZipWriter(std::ostream&);
~ZipWriter() override;
void writeFiles() override;
std::ostream& Stream() override
{
return ZipStream;
}
void setComment(const char* str)
{
ZipStream.setComment(str);
}
void setLevel(int level)
{
ZipStream.setLevel(level);
}
void putNextEntry(const char* filename, const char* objName = nullptr) override;
ZipWriter(const ZipWriter&) = delete;
ZipWriter(ZipWriter&&) = delete;
ZipWriter& operator=(const ZipWriter&) = delete;
ZipWriter& operator=(ZipWriter&&) = delete;
private:
zipios::ZipOutputStream ZipStream;
};
/** The StringWriter class
* This is an important helper class implementation for the store and retrieval system
* of objects in FreeCAD.
* \see Base::Persistence
* \author Juergen Riegel
*/
class BaseExport StringWriter: public Writer
{
public:
std::ostream& Stream() override
{
return StrStream;
}
std::string getString() const
{
return StrStream.str();
}
void writeFiles() override
{}
private:
std::stringstream StrStream;
};
/*! The FileWriter class
This class writes the data out as files in a given directory.
\see Base::Persistence
\author Werner Mayer
*/
class BaseExport FileWriter: public Writer
{
public:
explicit FileWriter(const char* DirName);
~FileWriter() override;
void putNextEntry(const char* filename, const char* objName = nullptr) override;
void writeFiles() override;
std::ostream& Stream() override
{
return FileStream;
}
void close()
{
FileStream.close();
}
/*!
This method can be re-implemented in sub-classes to avoid
writing out certain objects. The default implementation
always returns true.
*/
virtual bool shouldWrite(const std::string& name, const Base::Persistence* Object) const;
FileWriter(const FileWriter&) = delete;
FileWriter(FileWriter&&) = delete;
FileWriter& operator=(const FileWriter&) = delete;
FileWriter& operator=(FileWriter&&) = delete;
protected:
// NOLINTBEGIN
std::string DirName;
std::ofstream FileStream;
// NOLINTEND
};
} // namespace Base
#endif // SRC_BASE_WRITER_H_
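
The suffix rule used by UniqueFileNameManager (the last '.' and everything after it counts as the suffix) can be tried out in isolation with the sketch below. It does not use FileInfo: plain std::string searches stand in for the fileName()/fileNamePure() length difference, which assumes that fileNamePure() strips the last '.' and what follows it from the final path component. The function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Length of the extension including its '.' delimiter, or 0 when the final
// path component contains no '.'.
inline std::size_t extensionLengthWithDot(const std::string& path)
{
    const std::size_t slash = path.find_last_of('/');
    const std::size_t nameStart = (slash == std::string::npos) ? 0 : slash + 1;
    const std::size_t dot = path.find_last_of('.');
    if (dot == std::string::npos || dot < nameStart) {
        return 0;  // no '.' at all, or only in a directory part
    }
    return path.size() - dot;
}

// Same result expressed as a reverse iterator, the form returned by the
// override in the header: rbegin() advanced past the suffix characters.
inline std::string::const_reverse_iterator suffixStart(const std::string& path)
{
    using Diff = std::string::difference_type;
    return path.rbegin() + static_cast<Diff>(extensionLengthWithDot(path));
}
```

Counting the delimiter is what distinguishes "xyz" (suffix length 0) from "xyz." (suffix length 1), which an extension-only length cannot do.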