create/src/Base/Writer.cpp
Kevin Martin 41f09db9e1 Address performance of existing unique-name generation (Part 2) (#18676)
As described in Issue 16849, the existing Tools::getUniqueName method
requires calling code to form a vector of existing names to be avoided.

This leads to poor performance both in the O(n) cost of building such a
vector and also getUniqueName's O(n) algorithm for actually generating
the unique name (where 'n' is the number of pre-existing names).

This has a particularly noticeable cost in documents with large numbers
of DocumentObjects because generating both Names and Labels for each new
object incurs this cost. During an operation such as importing, which
creates many objects, this results in O(n^2) total time spent generating
names.
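
As an illustration of why a persistent name store helps, here is a minimal
sketch. The class NameRegistry and its methods are hypothetical and are not
the UniqueNameManager API; the sketch only assumes that names already in use
are remembered between requests, so no vector of existing names has to be
rebuilt and rescanned for every new object.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical illustration only; not FreeCAD's UniqueNameManager.
class NameRegistry
{
public:
    // Record a name as being in use.
    void add(const std::string& name)
    {
        used.insert(name);
    }

    // Hash lookup instead of a scan over all existing names.
    bool contains(const std::string& name) const
    {
        return used.count(name) != 0;
    }

    // Return `base` unchanged if it is free; otherwise append an increasing
    // number until the result is free. The counter is remembered per base,
    // so repeated requests for the same base do not restart from 1.
    std::string makeUnique(const std::string& base)
    {
        if (!contains(base)) {
            return base;
        }
        std::size_t& next = nextSuffix[base];  // starts at 0 on first use
        std::string candidate;
        do {
            ++next;
            candidate = base + std::to_string(next);
        } while (contains(candidate));
        return candidate;
    }

private:
    std::unordered_set<std::string> used;
    std::unordered_map<std::string, std::size_t> nextSuffix;
};
```

In this sketch the caller is expected to call add() with the returned name,
in the same way that a freshly generated name must be registered before the
next one is requested.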

The other major cost is in the saving of the temporary backup file,
which uses name generation for the "files" embedded in the Zip file.
Documents can easily need several such "files" for each object in the
document.

This update includes the following changes to use the newly-added
UniqueNameManager as a replacement for the old Tools::getUniqueName
method and deletes the latter to remove any temptation to use it as
its usage model breeds inefficiency:

Eliminate Tools::getUniqueName, its local functions, and its unit tests.

Make DocumentObject naming use the new UniqueNameManager class.

Make DocumentObject Label naming use the new UniqueNameManager class.
This needs to monitor DocumentObject Labels for changes since this
property is not read-only. The special handling for the Label
property, which includes optionally forcing uniqueness and updating
links in referencing objects, has been mostly moved from
PropertyString to DocumentObject.

Add Document::containsObject(DocumentObject*) for a definitive
test of an object being in a Document. This is needed because
DocumentObjects can be in a sort of limbo (e.g. when they are in the
Undo/Redo lists) where they have a parent linkage to the Document but
should not participate in Label collision checks.

Rename Document.getStandardObjectName to getStandardObjectLabel
to better represent what it does.

Use new UniqueNameManager for Writer internal filenames within the zip
file.

Eliminate unneeded Reader::FileNames collection. The file names
already exist in the FileList collection elements. The only existing
use for the FileNames collection was to determine if there were any
files at all, and with FileList and FileNames being parallel
vectors, they both had the same length so FileList could be used
for this test.

Use UniqueNameManager for document names and labels. This uses ad hoc
UniqueNameManager objects created on the spot on the assumption that
document creation is relatively rare and there are few documents, so
although the cost is O(n), n itself is small.

Use an ad hoc UniqueNameManager to name new DynamicProperty entries.
The manager is only built if a property of the proposed name already
exists, since such a check is more-or-less O(log(n)), almost never finds
a collision, and avoids the O(n) building of the UniqueNameManager.
If there is a collision an ad hoc UniqueNameManager is built
and discarded after use.
The property management classes have a bit of a mess of methods
including several to populate various collection types with all
existing properties. Rather than introducing yet another such
collection-specific method to fill a UniqueNameManager, a
visitProperties method was added which calls a passed function for
each property. The existing methods (e.g. getPropertyMap) would be
simpler if they all used this, but the cost of calling a lambda
for each property must be considered. Using it everywhere would also
clarify the semantics of these methods, which vary in which properties
populate the passed collection, e.g. when there are duplicate names.
Ideally the PropertyContainer class would keep a central directory of
all properties ("static", Dynamic, and exposed by ExtensionContainer and
other derivations) and a permanent UniqueNameManager. However the
Property management is a bit of a mess making such a change a project
unto itself.
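
The check-first approach and the visitor can be sketched together as follows.
This is a minimal sketch under stated assumptions: PropertyBag,
uniquePropertyName, and the visitor signature are hypothetical stand-ins and
are not the FreeCAD property classes. A std::set plays the role of the ad hoc
name manager.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a property container; not a FreeCAD class.
class PropertyBag
{
public:
    void add(const std::string& name, int value)
    {
        props[name] = value;
    }

    // Call `visitor` once per property instead of filling one specific
    // collection type.
    void visitProperties(const std::function<void(const std::string&, int)>& visitor) const
    {
        for (const auto& entry : props) {
            visitor(entry.first, entry.second);
        }
    }

    // Cheap lookup first; the full set of names is built only on a collision.
    std::string uniquePropertyName(const std::string& proposed) const
    {
        if (props.find(proposed) == props.end()) {
            return proposed;  // common case: no collision, nothing else is built
        }
        std::set<std::string> names;  // ad hoc, discarded when this function returns
        visitProperties([&names](const std::string& name, int /*value*/) {
            names.insert(name);
        });
        for (int suffix = 1;; ++suffix) {
            std::string candidate = proposed + std::to_string(suffix);
            if (names.count(candidate) == 0) {
                return candidate;
            }
        }
    }

private:
    std::map<std::string, int> props;
};
```

The O(n) walk over all properties happens only inside the collision branch,
which is the trade-off described above.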
2025-02-24 10:23:53 -06:00


/***************************************************************************
* Copyright (c) 2011 Jürgen Riegel <juergen.riegel@web.de> *
* *
* This file is part of the FreeCAD CAx development system. *
* *
* This library is free software; you can redistribute it and/or *
* modify it under the terms of the GNU Library General Public *
* License as published by the Free Software Foundation; either *
* version 2 of the License, or (at your option) any later version. *
* *
* This library is distributed in the hope that it will be useful, *
* but WITHOUT ANY WARRANTY; without even the implied warranty of *
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
* GNU Library General Public License for more details. *
* *
* You should have received a copy of the GNU Library General Public *
* License along with this library; see the file COPYING.LIB. If not, *
* write to the Free Software Foundation, Inc., 59 Temple Place, *
* Suite 330, Boston, MA 02111-1307, USA *
* *
***************************************************************************/
#include "PreCompiled.h"
#ifndef _PreComp_
#include <cassert>
#include <memory>
#include <set>
#include <vector>
#include <string>
#endif
#include <limits>
#include <locale>
#include <iomanip>
#include "Writer.h"
#include "Base64.h"
#include "Base64Filter.h"
#include "Exception.h"
#include "FileInfo.h"
#include "Persistence.h"
#include "Stream.h"
#include "Tools.h"
#include <boost/iostreams/filtering_stream.hpp>
#include <zipios++/zipinputstream.h>
using namespace Base;
// boost iostream filter to escape ']]>' in text file saved into CDATA section.
// It does not check if the character is valid utf8 or not.
struct cdata_filter
{
using char_type = char;
using category = boost::iostreams::output_filter_tag;
template<typename Device>
inline bool put(Device& dev, char ch)
{
switch (state) {
case 0:
case 1:
if (ch == ']') {
++state;
}
else {
state = 0;
}
break;
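case 2:
if (ch == '>') {
static const char escape[] = "]]><![CDATA[";
boost::iostreams::write(dev, escape, sizeof(escape) - 1);
state = 0;
}
else if (ch != ']') {
state = 0;
}
// A further ']' keeps state 2, since the last two characters are still
// "]]"; this way a run such as "]]]>" is escaped as well.
break;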
}
return boost::iostreams::put(dev, ch);
}
int state = 0;
};
// ---------------------------------------------------------------------------
// Writer: Constructors and Destructor
// ---------------------------------------------------------------------------
Writer::Writer()
{
indBuf[0] = '\0';
}
Writer::~Writer() = default;
std::ostream& Writer::beginCharStream(CharStreamFormat format)
{
if (CharStream) {
throw Base::RuntimeError("Writer::beginCharStream(): invalid state");
}
charStreamFormat = format;
if (format == CharStreamFormat::Base64Encoded) {
CharStream = create_base64_encoder(Stream(), Base::base64DefaultBufferSize);
}
else {
Stream() << "<![CDATA[";
CharStream = std::make_unique<boost::iostreams::filtering_ostream>();
auto* filteredStream = dynamic_cast<boost::iostreams::filtering_ostream*>(CharStream.get());
filteredStream->push(cdata_filter());
filteredStream->push(Stream());
*filteredStream << std::setprecision(std::numeric_limits<double>::digits10 + 1);
}
return *CharStream;
}
std::ostream& Writer::endCharStream()
{
if (CharStream) {
CharStream.reset();
if (charStreamFormat == CharStreamFormat::Raw) {
Stream() << "]]>";
}
}
return Stream();
}
std::ostream& Writer::charStream()
{
if (!CharStream) {
throw Base::RuntimeError("Writer::charStream(): no current character stream");
}
return *CharStream;
}
void Writer::insertText(const std::string& str)
{
beginCharStream() << str;
endCharStream();
}
void Writer::insertAsciiFile(const char* FileName)
{
Base::FileInfo fi(FileName);
Base::ifstream from(fi);
if (!from) {
throw Base::FileException("Writer::insertAsciiFile() Could not open file!");
}
Stream() << "<![CDATA[";
char ch {};
while (from.get(ch)) {
Stream().put(ch);
}
Stream() << "]]>" << std::endl;
}
void Writer::insertBinFile(const char* FileName)
{
Base::FileInfo fi(FileName);
Base::ifstream from(fi, std::ios::in | std::ios::binary | std::ios::ate);
if (!from) {
throw Base::FileException("Writer::insertBinFile() Could not open file!");
}
Stream() << "<![CDATA[";
std::ifstream::pos_type fileSize = from.tellg();
from.seekg(0, std::ios::beg);
std::vector<unsigned char> bytes(static_cast<size_t>(fileSize));
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
from.read(reinterpret_cast<char*>(bytes.data()), fileSize);
Stream() << Base::base64_encode(bytes.data(), static_cast<unsigned int>(fileSize));
Stream() << "]]>" << std::endl;
}
void Writer::setForceXML(bool on)
{
forceXML = on;
}
bool Writer::isForceXML() const
{
return forceXML;
}
void Writer::setFileVersion(int version)
{
fileVersion = version;
}
int Writer::getFileVersion() const
{
return fileVersion;
}
void Writer::setMode(const std::string& mode)
{
Modes.insert(mode);
}
void Writer::setModes(const std::set<std::string>& modes)
{
Modes = modes;
}
bool Writer::getMode(const std::string& mode) const
{
std::set<std::string>::const_iterator it = Modes.find(mode);
return (it != Modes.end());
}
std::set<std::string> Writer::getModes() const
{
return Modes;
}
void Writer::clearMode(const std::string& mode)
{
std::set<std::string>::iterator it = Modes.find(mode);
if (it != Modes.end()) {
Modes.erase(it);
}
}
void Writer::clearModes()
{
Modes.clear();
}
void Writer::addError(const std::string& msg)
{
Errors.push_back(msg);
}
bool Writer::hasErrors() const
{
return (!Errors.empty());
}
void Writer::clearErrors()
{
Errors.clear();
}
std::vector<std::string> Writer::getErrors() const
{
return Errors;
}
std::string Writer::addFile(const char* Name, const Base::Persistence* Object)
{
// always check isForceXML() before requesting a file!
assert(!isForceXML());
FileEntry temp;
temp.FileName = Name ? Name : "";
if (FileNameManager.containsName(temp.FileName)) {
temp.FileName = FileNameManager.makeUniqueName(temp.FileName);
}
temp.Object = Object;
FileList.push_back(temp);
FileNameManager.addExactName(temp.FileName);
// return the unique file name
return temp.FileName;
}
void Writer::incInd()
{
if (indent < 1020) {
indBuf[indent] = ' ';
indBuf[indent + 1] = ' ';
indBuf[indent + 2] = ' ';
indBuf[indent + 3] = ' ';
indBuf[indent + 4] = '\0';
indent += 4;
}
}
void Writer::decInd()
{
if (indent >= 4) {
indent -= 4;
}
else {
indent = 0;
}
indBuf[indent] = '\0';
}
void Writer::putNextEntry(const char* file, const char* obj)
{
ObjectName = obj ? obj : file;
}
// ----------------------------------------------------------------------------
ZipWriter::ZipWriter(const char* FileName)
: ZipStream(FileName)
{
#ifdef _MSC_VER
ZipStream.imbue(std::locale::empty());
#else
ZipStream.imbue(std::locale::classic());
#endif
ZipStream.precision(std::numeric_limits<double>::digits10 + 1);
ZipStream.setf(std::ios::fixed, std::ios::floatfield);
}
ZipWriter::ZipWriter(std::ostream& os)
: ZipStream(os)
{
#ifdef _MSC_VER
ZipStream.imbue(std::locale::empty());
#else
ZipStream.imbue(std::locale::classic());
#endif
ZipStream.precision(std::numeric_limits<double>::digits10 + 1);
ZipStream.setf(std::ios::fixed, std::ios::floatfield);
}
void ZipWriter::putNextEntry(const char* file, const char* obj)
{
Writer::putNextEntry(file, obj);
ZipStream.putNextEntry(file);
}
void ZipWriter::writeFiles()
{
// use a while loop because it is possible that while
// processing the files new ones can be added
size_t index = 0;
while (index < FileList.size()) {
FileEntry entry = FileList[index];
putNextEntry(entry.FileName.c_str());
indent = 0;
indBuf[0] = 0;
entry.Object->SaveDocFile(*this);
index++;
}
}
ZipWriter::~ZipWriter()
{
ZipStream.close();
}
// ----------------------------------------------------------------------------
FileWriter::FileWriter(const char* DirName)
: DirName(DirName)
{}
FileWriter::~FileWriter() = default;
void FileWriter::putNextEntry(const char* file, const char* obj)
{
Writer::putNextEntry(file, obj);
std::string fileName = DirName + "/" + file;
this->FileStream.open(fileName.c_str(), std::ios::out | std::ios::binary);
}
bool FileWriter::shouldWrite(const std::string& /*name*/, const Base::Persistence* /*obj*/) const
{
return true;
}
void FileWriter::writeFiles()
{
// use a while loop because it is possible that while
// processing the files new ones can be added
size_t index = 0;
this->FileStream.close();
while (index < FileList.size()) {
FileEntry entry = FileList[index];
if (shouldWrite(entry.FileName, entry.Object)) {
std::string filePath = entry.FileName;
std::string::size_type pos = 0;
while ((pos = filePath.find('/', pos)) != std::string::npos) {
std::string dirName = DirName + "/" + filePath.substr(0, pos);
pos++;
Base::FileInfo fi(dirName);
fi.createDirectory();
}
putNextEntry(entry.FileName.c_str());
indent = 0;
indBuf[0] = 0;
entry.Object->SaveDocFile(*this);
this->FileStream.close();
}
index++;
}
}