* Address the poor performance of the existing unique-name generation

As described in Issue 16849, the existing Tools::getUniqueName method
requires calling code to form a vector of existing names to be avoided.
This leads to poor performance both in the O(n) cost of building such a
vector and in getUniqueName's O(n) algorithm for actually generating
the unique name (where 'n' is the number of pre-existing names).

This has a particularly noticeable cost in documents with large numbers
of DocumentObjects because generating both Names and Labels for each new
object incurs this cost. During an operation such as importing, this
results in O(n^2) time spent generating names.

The other major cost is in the saving of the temporary backup file,
which uses name generation for the "files" embedded in the Zip file.
Documents can easily need several such "files" for each object in the
document.

This update includes the following changes:

Create UniqueNameManager to keep a list of existing names organized in
a manner that eases unique-name generation. This class essentially acts
as a set of names, with the ability to add and remove names and check if
a name is already there, with the added ability to take a prototype name
and generate a unique form for it which is not already in the set.

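The set-plus-generator idea described above can be sketched roughly as follows. This is a deliberately simplified, hypothetical stand-in (the class name SimpleNameSet and all of its members are illustrative, not the FreeCAD API). It assumes names split into a base plus trailing digits and, unlike the real class, ignores digit counts, so "Box7" and "Box007" are treated as the same name.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for UniqueNameManager (not the FreeCAD class).
class SimpleNameSet
{
public:
    void add(const std::string& name)
    {
        auto [base, number] = split(name);
        used[base].insert(number);
    }

    void remove(const std::string& name)
    {
        auto [base, number] = split(name);
        auto entry = used.find(base);
        if (entry == used.end()) {
            return;  // base was never registered
        }
        entry->second.erase(number);
        if (entry->second.empty()) {
            used.erase(entry);  // forget bases that have no names left
        }
    }

    bool contains(const std::string& name) const
    {
        auto [base, number] = split(name);
        auto entry = used.find(base);
        return entry != used.end() && entry->second.count(number) > 0;
    }

    // Returns the bare base if that base is unused, otherwise base + (highest number + 1).
    std::string makeUnique(const std::string& model) const
    {
        const std::string base = split(model).first;
        auto entry = used.find(base);
        if (entry == used.end()) {
            return base;
        }
        return base + std::to_string(*entry->second.rbegin() + 1);
    }

private:
    // Splits "Box12" into {"Box", 12}; a name without trailing digits gets number 0.
    static std::pair<std::string, unsigned long> split(const std::string& name)
    {
        std::size_t end = name.size();
        while (end > 0 && std::isdigit(static_cast<unsigned char>(name[end - 1]))) {
            --end;
        }
        unsigned long number = end == name.size() ? 0 : std::stoul(name.substr(end));
        return {name.substr(0, end), number};
    }

    // base name -> numeric suffixes currently in use (ordered, so the highest is cheap to find)
    std::map<std::string, std::set<unsigned long>> used;
};
```

Every operation in this sketch is a couple of ordered-container lookups, so no per-call vector of existing names has to be built.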
Eliminate Tools::getUniqueName.

Make DocumentObject naming use the new UniqueNameManager class.

Make DocumentObject Label naming use the new UniqueNameManager class.
Labels are not always unique; unique labels are generated if the
settings at the time request it (and other conditions). Because of this
the Label management requires additionally keeping a map of counts
for labels which already exist more than once.
These collections are maintained via notifications of value changes on
the Label properties of the objects in the document.

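One way to picture the duplicate-count bookkeeping is the sketch below. It is only an illustration under assumptions: LabelRegistry and its method names are hypothetical, a plain std::set stands in for the UniqueNameManager, and labelChanged represents whatever notification the Label property actually raises.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical bookkeeping for non-unique labels (names are illustrative, not FreeCAD's).
class LabelRegistry
{
public:
    void labelAdded(const std::string& label)
    {
        if (!labels.insert(label).second) {
            // Already present: record a count of 2, or bump an existing count.
            auto [entry, isNew] = duplicateCounts.try_emplace(label, 2);
            if (!isNew) {
                ++entry->second;
            }
        }
    }

    void labelRemoved(const std::string& label)
    {
        auto entry = duplicateCounts.find(label);
        if (entry == duplicateCounts.end()) {
            labels.erase(label);  // last (or only) user of this label
        }
        else if (--entry->second == 1) {
            duplicateCounts.erase(entry);  // back to a single user
        }
    }

    // Would be called from a (hypothetical) label-change notification.
    void labelChanged(const std::string& oldLabel, const std::string& newLabel)
    {
        labelRemoved(oldLabel);
        labelAdded(newLabel);
    }

    bool contains(const std::string& label) const
    {
        return labels.count(label) > 0;
    }

private:
    std::set<std::string> labels;                // every distinct label in use
    std::map<std::string, int> duplicateCounts;  // only labels used two or more times
};
```

Keeping counts only for duplicated labels means the common all-unique case costs nothing beyond the name set itself.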
Add Document::containsObject(DocumentObject*) for a definitive
test of an object being in a Document. This is needed because
DocumentObjects can be in a sort of limbo (e.g. when they are in the
Undo/Redo lists) where they have a parent linkage to the Document but
should not participate in Label collision checks.

Rename Document.getStandardObjectName to getStandardObjectLabel
to better represent what it does.

Use new UniqueNameManager for Writer internal filenames within the zip
file.

Eliminate unneeded Reader::FileNames collection. The file names
already exist in the FileList collection elements. The only existing
use for the FileNames collection was to determine if there were any
files at all, and with FileList and FileNames being parallel
vectors, they both had the same length so FileList could be used
for this test.

Use UniqueNameManager for document names and labels. This uses ad hoc
UniqueNameManager objects created on the spot on the assumption that
document creation is relatively rare and there are few documents, so
although the cost is O(n), n itself is small.

Use an ad hoc UniqueNameManager to name new DynamicProperty entries.
This is only done if a property of the proposed name already exists,
since such a check is more-or-less O(log(n)), almost never finds a
collision, and avoids the O(n) building of the UniqueNameManager.
If there is a collision, an ad hoc UniqueNameManager is built
and discarded after use.

The property management classes have a bit of a mess of methods
including several to populate various collection types with all
existing properties. Rather than introducing yet another such
collection-specific method to fill a UniqueNameManager, a
visitProperties method was added which calls a passed function for
each property. The existing code would be simpler if existing
fill-container methods all used this.

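As a rough illustration of how fill-container methods could sit on top of a visitor, consider the sketch below. Everything in it is assumed for illustration: the Property and PropertyHolder types, the std::function-based signature, and the helper names are hypothetical and do not reflect the real PropertyContainer interface.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins; the real Property and PropertyContainer classes are far richer.
struct Property
{
    std::string name;
};

class PropertyHolder
{
public:
    void addProperty(const std::string& name)
    {
        properties.push_back(Property {name});
    }

    // Calls visitor once per property, with no early exit.
    void visitProperties(const std::function<void(const Property&)>& visitor) const
    {
        for (const Property& prop : properties) {
            visitor(prop);
        }
    }

    // Collection-specific helpers become thin wrappers over visitProperties.
    std::vector<std::string> propertyNames() const
    {
        std::vector<std::string> names;
        visitProperties([&names](const Property& prop) { names.push_back(prop.name); });
        return names;
    }

    std::map<std::string, const Property*> propertyMap() const
    {
        std::map<std::string, const Property*> byName;
        visitProperties([&byName](const Property& prop) { byName[prop.name] = &prop; });
        return byName;
    }

private:
    std::vector<Property> properties;
};
```

With this shape, filling a new kind of collection only needs a new lambda rather than another dedicated method on the container class.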
Ideally the PropertyContainer class would keep a central directory of
all properties ("static", Dynamic, and exposed by ExtensionContainer and
other derivations) and a permanent UniqueNameManager. However, the
Property management is a bit of a mess, making such a change a project
unto itself.

The unit tests for Tools::getUniqueName have been changed to test
UniqueNameManager.makeUniqueName instead.
This revealed a small regression insofar as passing a prototype name
like "xyz1234" to the old code would yield "xyz1235" whether or
not "xyz1234" already existed, while the new code will return the next
name above the currently-highest name on the "xyz" model, which could
be "xyz" or "xyz1".

* Correct wrong case on include path
* Implement suggested code changes
Also change the semantics of visitProperties to not have any short-circuit return
* Remove reference through undefined iterator
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix up some comments for Doxygen
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
/***************************************************************************
 *   Copyright (c) 2009 Werner Mayer <wmayer[at]users.sourceforge.net>     *
 *                                                                         *
 *   This file is part of the FreeCAD CAx development system.              *
 *                                                                         *
 *   This library is free software; you can redistribute it and/or         *
 *   modify it under the terms of the GNU Library General Public           *
 *   License as published by the Free Software Foundation; either          *
 *   version 2 of the License, or (at your option) any later version.      *
 *                                                                         *
 *   This library is distributed in the hope that it will be useful,       *
 *   but WITHOUT ANY WARRANTY; without even the implied warranty of        *
 *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the          *
 *   GNU Library General Public License for more details.                  *
 *                                                                         *
 *   You should have received a copy of the GNU Library General Public     *
 *   License along with this library; see the file COPYING.LIB. If not,    *
 *   write to the Free Software Foundation, Inc., 59 Temple Place,         *
 *   Suite 330, Boston, MA 02111-1307, USA                                 *
 *                                                                         *
 ***************************************************************************/


#include "PreCompiled.h"
#ifndef _PreComp_
#include <sstream>
#include <locale>
#include <iostream>
#include <QDateTime>
#endif

#include "PyExport.h"
#include "Interpreter.h"
#include "Tools.h"

void Base::UniqueNameManager::PiecewiseSparseIntegerSet::Add(uint value)
{
    etype newSpan(value, 1);
    iterator above = Spans.lower_bound(newSpan);
    if (above != Spans.end() && above->first <= value) {
        // The found span includes value so there is nothing to do as it is already in the set.
        return;
    }

    // Set below to the next span down, if any
    iterator below;
    if (above == Spans.begin()) {
        below = Spans.end();
    }
    else {
        below = above;
        --below;
    }

    if (above != Spans.end() && below != Spans.end()
        && below->first + below->second + 1 == above->first) {
        // below and above have a gap of exactly one between them, and this must be value
        // so we coalesce the two spans (and the gap) into one.
        newSpan = etype(below->first, below->second + above->second + 1);
        Spans.erase(above);
        above = Spans.erase(below);
    }
    else if (below != Spans.end() && value - below->first == below->second) {
        // value is adjacent to the end of below, so just expand below by one
        newSpan = etype(below->first, below->second + 1);
        above = Spans.erase(below);
    }
    else if (above != Spans.end() && above->first - value == 1) {
        // value is adjacent to the start of above, so just expand above down by one
        newSpan = etype(above->first - 1, above->second + 1);
        above = Spans.erase(above);
    }
    // else value is not adjacent to any existing span, so just make a new span for it
    Spans.insert(above, newSpan);
}
void Base::UniqueNameManager::PiecewiseSparseIntegerSet::Remove(uint value)
{
    etype newSpan(value, 1);
    iterator at = Spans.lower_bound(newSpan);
    if (at == Spans.end() || at->first > value) {
        // The found span does not include value so there is nothing to do, as it is already not in
        // the set.
        return;
    }
    if (at->second == 1) {
        // value is the only member of this span, just remove the span
        Spans.erase(at);
    }
    else if (at->first == value) {
        // value is the first in this span, trim the lower end
        etype replacement(at->first + 1, at->second - 1);
        Spans.insert(Spans.erase(at), replacement);
    }
    else if (value - at->first == at->second - 1) {
        // value is the last in this span, trim the upper end
        etype replacement(at->first, at->second - 1);
        Spans.insert(Spans.erase(at), replacement);
    }
    else {
        // value is in the middle of the span, so we must split it.
        etype firstReplacement(at->first, value - at->first);
        etype secondReplacement(value + 1, at->second - ((value + 1) - at->first));
        // Because erase returns the iterator after the erased element, and insert returns the
        // iterator for the inserted item, we want to insert secondReplacement first.
        Spans.insert(Spans.insert(Spans.erase(at), secondReplacement), firstReplacement);
    }
}
bool Base::UniqueNameManager::PiecewiseSparseIntegerSet::Contains(uint value) const
{
    iterator at = Spans.lower_bound(etype(value, 1));
    return at != Spans.end() && at->first <= value;
}

std::tuple<uint, uint> Base::UniqueNameManager::decomposeName(const std::string& name,
                                                              std::string& baseNameOut,
                                                              std::string& nameSuffixOut) const
{
    auto suffixStart = std::make_reverse_iterator(GetNameSuffixStartPosition(name));
    nameSuffixOut = name.substr(name.crend() - suffixStart);
    auto digitsStart = std::find_if_not(suffixStart, name.crend(), [](char c) {
        // Cast first: std::isdigit has undefined behavior for negative char values.
        return std::isdigit(static_cast<unsigned char>(c)) != 0;
    });
    baseNameOut = name.substr(0, name.crend() - digitsStart);
    uint digitCount = digitsStart - suffixStart;
    if (digitCount == 0) {
        // No digits in name
        return std::tuple<uint, uint> {0, 0};
    }
    else {
        return std::tuple<uint, uint> {
            digitCount,
            std::stoul(name.substr(name.crend() - digitsStart, digitCount))};
    }
}
void Base::UniqueNameManager::addExactName(const std::string& name)
{
    std::string baseName;
    std::string nameSuffix;
    uint digitCount;
    uint digitsValue;
    std::tie(digitCount, digitsValue) = decomposeName(name, baseName, nameSuffix);
    baseName += nameSuffix;
    auto baseNameEntry = UniqueSeeds.find(baseName);
    if (baseNameEntry == UniqueSeeds.end()) {
        // First use of baseName
        baseNameEntry =
            UniqueSeeds.emplace(baseName, std::vector<PiecewiseSparseIntegerSet>()).first;
    }
    if (digitCount >= baseNameEntry->second.size()) {
        // First use of this digitCount
        baseNameEntry->second.resize(digitCount + 1);
    }
    PiecewiseSparseIntegerSet& baseNameAndDigitCountEntry = baseNameEntry->second[digitCount];
    // Name should not already be there
    assert(!baseNameAndDigitCountEntry.Contains(digitsValue));
    baseNameAndDigitCountEntry.Add(digitsValue);
}
std::string Base::UniqueNameManager::makeUniqueName(const std::string& modelName,
                                                    int minDigits) const
{
    std::string namePrefix;
    std::string nameSuffix;
    decomposeName(modelName, namePrefix, nameSuffix);
    std::string baseName = namePrefix + nameSuffix;
    auto baseNameEntry = UniqueSeeds.find(baseName);
    if (baseNameEntry == UniqueSeeds.end()) {
        // First use of baseName, just return it with no unique digits
        return baseName;
    }
    // We don't care about the digit count of the suggested name, we always use at least the most
    // digits ever used before.
    int digitCount = baseNameEntry->second.size() - 1;
    uint digitsValue;
    if (digitCount < minDigits) {
        // Caller is asking for more digits than we have in any registered name.
        // We start the longer digit string at 000...0001 even though we might have shorter strings
        // with larger numeric values.
        digitCount = minDigits;
        digitsValue = 1;
    }
    else {
        digitsValue = baseNameEntry->second[digitCount].Next();
    }
    std::string digits = std::to_string(digitsValue);
    if (digitCount > digits.size()) {
        namePrefix += std::string(digitCount - digits.size(), '0');
    }
    return namePrefix + digits + nameSuffix;
}

void Base::UniqueNameManager::removeExactName(const std::string& name)
{
    std::string baseName;
    std::string nameSuffix;
    uint digitCount;
    uint digitsValue;
    std::tie(digitCount, digitsValue) = decomposeName(name, baseName, nameSuffix);
    baseName += nameSuffix;
    auto baseNameEntry = UniqueSeeds.find(baseName);
    if (baseNameEntry == UniqueSeeds.end()) {
        // name must not be registered, so nothing to do.
        return;
    }
    auto& digitValueSets = baseNameEntry->second;
    if (digitCount >= digitValueSets.size()) {
        // First use of this digitCount, name must not be registered, so nothing to do.
        return;
    }
    digitValueSets[digitCount].Remove(digitsValue);
    // an element of digitValueSets may now be newly empty and so may other elements below it
    // Prune off all such trailing empty entries.
    auto lastNonemptyEntry =
        std::find_if(digitValueSets.crbegin(), digitValueSets.crend(), [](auto& it) {
            return it.Any();
        });
    if (lastNonemptyEntry == digitValueSets.crend()) {
        // All entries are empty, so the entire baseName can be forgotten.
        UniqueSeeds.erase(baseName);
    }
    else {
        digitValueSets.resize(digitValueSets.crend() - lastNonemptyEntry);
    }
}

bool Base::UniqueNameManager::containsName(const std::string& name) const
{
    std::string baseName;
    std::string nameSuffix;
    uint digitCount;
    uint digitsValue;
    std::tie(digitCount, digitsValue) = decomposeName(name, baseName, nameSuffix);
    baseName += nameSuffix;
    auto baseNameEntry = UniqueSeeds.find(baseName);
    if (baseNameEntry == UniqueSeeds.end()) {
        // base name is not registered
        return false;
    }
    if (digitCount >= baseNameEntry->second.size()) {
        // First use of this digitCount, name must not be registered, so not in collection
        return false;
    }
    return baseNameEntry->second[digitCount].Contains(digitsValue);
}
std::string Base::Tools::getIdentifier(const std::string& name)
{
    if (name.empty()) {
        return "_";
    }
    // check for first character whether it's a digit
    std::string CleanName = name;
    if (!CleanName.empty() && CleanName[0] >= 48 && CleanName[0] <= 57) {
        CleanName[0] = '_';
    }
    // strip illegal chars
    for (char& it : CleanName) {
        if (!((it >= 48 && it <= 57) ||    // number
              (it >= 65 && it <= 90) ||    // uppercase letter
              (it >= 97 && it <= 122))) {  // lowercase letter
            it = '_';                      // it's neither number nor letter
        }
    }

    return CleanName;
}

std::wstring Base::Tools::widen(const std::string& str)
{
    std::wostringstream wstm;
    const std::ctype<wchar_t>& ctfacet = std::use_facet<std::ctype<wchar_t>>(wstm.getloc());
    for (char i : str) {
        wstm << ctfacet.widen(i);
    }
    return wstm.str();
}

std::string Base::Tools::narrow(const std::wstring& str)
{
    std::ostringstream stm;
    const std::ctype<char>& ctfacet = std::use_facet<std::ctype<char>>(stm.getloc());
    for (wchar_t i : str) {
        stm << ctfacet.narrow(i, 0);
    }
    return stm.str();
}

std::string Base::Tools::escapedUnicodeFromUtf8(const char* s)
{
    Base::PyGILStateLocker lock;
    std::string escapedstr;

    PyObject* unicode = PyUnicode_FromString(s);
    if (!unicode) {
        return escapedstr;
    }

    PyObject* escaped = PyUnicode_AsUnicodeEscapeString(unicode);
    if (escaped) {
        escapedstr = std::string(PyBytes_AsString(escaped));
        Py_DECREF(escaped);
    }

    Py_DECREF(unicode);
    return escapedstr;
}

std::string Base::Tools::escapedUnicodeToUtf8(const std::string& s)
{
    Base::PyGILStateLocker lock;
    std::string string;

    PyObject* unicode =
        PyUnicode_DecodeUnicodeEscape(s.c_str(), static_cast<Py_ssize_t>(s.size()), "strict");
    if (!unicode) {
        return string;
    }
    if (PyUnicode_Check(unicode)) {
        string = PyUnicode_AsUTF8(unicode);
    }
    Py_DECREF(unicode);
    return string;
}

std::string Base::Tools::escapeQuotesFromString(const std::string& s)
{
    std::string result;
    size_t len = s.size();
    for (size_t i = 0; i < len; ++i) {
        switch (s.at(i)) {
            case '\"':
                result += "\\\"";
                break;
            case '\'':
                result += "\\\'";
                break;
            default:
                result += s.at(i);
                break;
        }
    }
    return result;
}

QString Base::Tools::escapeEncodeString(const QString& s)
{
    QString result;
    const int len = s.length();
    result.reserve(int(len * 1.1));
    for (int i = 0; i < len; ++i) {
        if (s.at(i) == QLatin1Char('\\')) {
            result += QLatin1String("\\\\");
        }
        else if (s.at(i) == QLatin1Char('\"')) {
            result += QLatin1String("\\\"");
        }
        else if (s.at(i) == QLatin1Char('\'')) {
            result += QLatin1String("\\\'");
        }
        else {
            result += s.at(i);
        }
    }
    result.squeeze();
    return result;
}

std::string Base::Tools::escapeEncodeString(const std::string& s)
{
    std::string result;
    size_t len = s.size();
    for (size_t i = 0; i < len; ++i) {
        switch (s.at(i)) {
            case '\\':
                result += "\\\\";
                break;
            case '\"':
                result += "\\\"";
                break;
            case '\'':
                result += "\\\'";
                break;
            default:
                result += s.at(i);
                break;
        }
    }
    return result;
}

QString Base::Tools::escapeEncodeFilename(const QString& s)
{
    QString result;
    const int len = s.length();
    result.reserve(int(len * 1.1));
    for (int i = 0; i < len; ++i) {
        if (s.at(i) == QLatin1Char('\"')) {
            result += QLatin1String("\\\"");
        }
        else if (s.at(i) == QLatin1Char('\'')) {
            result += QLatin1String("\\\'");
        }
        else {
            result += s.at(i);
        }
    }
    result.squeeze();
    return result;
}

std::string Base::Tools::escapeEncodeFilename(const std::string& s)
{
    std::string result;
    size_t len = s.size();
    for (size_t i = 0; i < len; ++i) {
        switch (s.at(i)) {
            case '\"':
                result += "\\\"";
                break;
            case '\'':
                result += "\\\'";
                break;
            default:
                result += s.at(i);
                break;
        }
    }
    return result;
}

std::string Base::Tools::quoted(const char* name)
{
    std::stringstream str;
    str << "\"" << name << "\"";
    return str.str();
}

std::string Base::Tools::quoted(const std::string& name)
{
    std::stringstream str;
    str << "\"" << name << "\"";
    return str.str();
}

std::string Base::Tools::joinList(const std::vector<std::string>& vec, const std::string& sep)
{
    std::stringstream str;
    for (const auto& it : vec) {
        str << it << sep;
    }
    return str.str();
}

std::string Base::Tools::currentDateTimeString()
{
    return QDateTime::currentDateTime()
        .toTimeSpec(Qt::OffsetFromUTC)
        .toString(Qt::ISODate)
        .toStdString();
}

std::vector<std::string> Base::Tools::splitSubName(const std::string& subname)
{
    // Turns 'Part.Part001.Body.Pad.Edge1'
    // Into ['Part', 'Part001', 'Body', 'Pad', 'Edge1']
    std::vector<std::string> subNames;
    std::string subName;
    std::istringstream subNameStream(subname);
    while (std::getline(subNameStream, subName, '.')) {
        subNames.push_back(subName);
    }

    // Check if the last character of the input string is the delimiter.
    // If so, add an empty string to the subNames vector.
    // Because the last subname is the element name and can be empty.
    if (!subname.empty() && subname.back() == '.') {
        subNames.push_back("");  // Append empty string for trailing dot.
    }

    return subNames;
}

// ------------------------------------------------------------------------------------------------

void Base::ZipTools::rewrite(const std::string& source, const std::string& target)
{
    Base::PyGILStateLocker lock;
    PyObject* module = PyImport_ImportModule("freecad.utils_zip");
    if (!module) {
        throw Py::Exception();
    }

    Py::Module commands(module, true);
    commands.callMemberFunction("rewrite", Py::TupleN(Py::String(source), Py::String(target)));
}