V‑Spark 4.0
V‑Spark 4.0 is a major release with significantly changed architecture, along with numerous improvements and bug fixes.
V‑Spark now runs on CentOS 7. Changes to V‑Spark’s technical foundation are primarily related to the CentOS 7 upgrade and are the main drivers for this release.
V‑Spark services are now managed by systemd. SysVinit has been phased out and replaced with systemd as part of the upgrade to CentOS 7.
This update removes init.d and intermediate scripts where possible, and introduces a hierarchy of individually managed services defined in unit files. The new service hierarchy can be represented as follows:
V‑Spark services can be controlled as a group by using the top-level vspark.service, which acts as a wrapper around the others. To start this service, use the following command:
$ sudo systemctl start vspark
The services started by vspark.service are specified in the Wants directive in the /usr/lib/systemd/system/vspark.service file. If vspark.service is enabled, it starts automatically on boot and reboot, and any services in the Wants directive will also try to start. Any excluded services will not start.
Each service can be started, stopped, and restarted separately as needed by using systemctl, without invoking vspark.service, as in these examples:
$ sudo systemctl start vspark-back
$ sudo systemctl start vspark-front
$ sudo systemctl start vspark-jobmgr
Other systemctl commands for vspark include the following:
$ sudo systemctl restart vspark
$ sudo systemctl stop vspark
$ sudo systemctl status vspark
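To see which services the wrapper unit would pull in, the Wants directive can be read straight from the unit file. The following is a minimal sketch rather than a V‑Spark tool: the helper name list_wants is hypothetical, and it assumes the directive is written as one or more plain Wants= lines with space-separated unit names.

```shell
# list_wants UNIT_FILE
# Print each unit named in the Wants= lines of a systemd unit file, one per line.
# (Hypothetical helper; assumes plain "Wants=a.service b.service" lines.)
list_wants() {
  grep -E '^Wants=' "$1" | cut -d= -f2- | tr -s ' ' '\n' | grep -v '^$'
}

# Example (path taken from the text above):
#   list_wants /usr/lib/systemd/system/vspark.service
```

On a running system, systemctl show --property=Wants vspark.service reports the resolved value, and sudo systemctl enable vspark is the standard systemd way to enable a unit so that it starts on boot.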
V‑Spark now has a dedicated command line script for system administration. To use it, run sudo vspark-admin OPTION with any of the parameters in the following table.
Table 1. Option parameters for vspark-admin
| Parameter | Description |
| --- | --- |
| status | Show the status of all V‑Spark services. |
| status OPTION | Show the status of a specific service. To specify that service, replace OPTION with one of these parameters: front, back, jobmgr, sccluster |
| version | Display the current version of V‑Spark that is installed. |
| core-check | Show whether service dependencies are in the allowed version range. |
| core-update | Show and apply available updates for V‑Spark dependencies. To preview required changes to core schemas and data structures, invoke core-update as in this example: $ sudo vspark-admin core-update To apply required changes to core schemas and data structures, invoke core-update with the commit parameter -c as in this example: $ sudo vspark-admin core-update -c |
| check-health | Show status information about service dependencies. |
| show-config | Display a list of every config setting (represented in value pairs) that V‑Spark is using. |
Note: Any configuration changes made to a running version of V‑Spark must be reloaded for changes to take effect.
For example, to check the status of V‑Spark's front-end services, run the following command:
$ sudo vspark-admin status front
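The per-service status check can be repeated for each of the service names listed in Table 1. The sketch below is illustrative only: the function name status_all_services and the VSPARK_ADMIN override variable are hypothetical, and it assumes vspark-admin returns a non-zero exit status when a status check fails, which these notes do not confirm.

```shell
# Command used to reach vspark-admin; override for a dry run (e.g. VSPARK_ADMIN=echo).
VSPARK_ADMIN="${VSPARK_ADMIN:-sudo vspark-admin}"

# Run "status" for each service name from Table 1 and report whether any call failed.
status_all_services() {
  local svc rc=0
  for svc in front back jobmgr sccluster; do
    echo "== ${svc} =="
    # Word splitting of $VSPARK_ADMIN is intentional so "sudo vspark-admin" works.
    $VSPARK_ADMIN status "$svc" || rc=1
  done
  return "$rc"
}
```

Setting VSPARK_ADMIN=echo before calling the function prints the commands that would be run instead of executing them.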
V‑Spark now attempts to reconnect to service dependencies automatically. When certain required services become unavailable, V‑Spark will attempt to reconnect to them in order to minimize service disruption. By default, there is no limit to the number of times V‑Spark will attempt to reconnect.
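The reconnect behavior described above amounts to a retry loop with no upper bound by default. The following sketch shows that pattern in shell; it is not V‑Spark's implementation, and the function name retry_until_up, its arguments, and the fixed delay between attempts are all assumptions made for illustration.

```shell
# retry_until_up MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds. MAX_ATTEMPTS=0 means "no limit",
# mirroring the default described in the text. Returns 1 if the limit is reached.
retry_until_up() {
  local max="$1" delay="$2" attempt=0
  shift 2
  until "$@"; do
    attempt=$((attempt + 1))
    if [ "$max" -gt 0 ] && [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    sleep "$delay"
  done
  return 0
}
```

For example, retry_until_up 0 5 systemctl is-active --quiet mariadb would wait indefinitely, polling every five seconds, until systemd reports the unit as active; the unit name here is only an example.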
V‑Spark now uses improved logic to handle connectivity issues with the license server. When revalidation fails due to loss of network connectivity or other scenarios, V‑Spark will respect the time to live (TTL) value associated with the requested license. As long as the TTL with the license server hasn't expired, the license will stay validated in order to minimize service interruptions.
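The TTL rule above can be expressed as a simple time comparison. This is a sketch of the idea only, not the product's licensing code: the function name license_within_ttl and its arguments (last successful validation time and TTL, both in seconds) are hypothetical.

```shell
# license_within_ttl LAST_VALIDATED_EPOCH TTL_SECONDS [NOW_EPOCH]
# Succeeds (exit 0) while the TTL window that began at the last successful
# validation is still open, even if revalidation is currently failing.
license_within_ttl() {
  local last_ok="$1" ttl="$2"
  local now="${3:-$(date +%s)}"
  [ "$now" -lt $((last_ok + ttl)) ]
}
```

With a last validation at 1000 and a TTL of 3600 seconds, the check succeeds at time 2000 and fails from time 4600 onward.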
V‑Spark now includes the voci-spark-tools package, which contains two utilities, Datatool and Config Manager, that facilitate the transfer of installation data and configuration settings.
Datatool is a command-line tool for importing (load) and exporting (backup) audio and transcription data. For more details, see the Datatool.MD file in the utils/datatool directory.
Config Manager is a script for importing and exporting installation configuration settings. This script can be found in the utils/ directory.
Improved V‑Spark security to help protect against remote code execution. For more secure file handling, name validation rules apply to audio files and ZIP archives uploaded for transcription via the GUI and API, and also to files uploaded to the Audio Evaluator. Files inside a ZIP archive are not checked. This feature was implemented with release version 4.0.1-3. Default naming rules forbid these characters:
#*<>:?/\|{}$!'`"=^
Filename validation is enabled by default. To disable it, set the filename_validation system configuration setting to off. To define custom character requirements, specify a regular expression via filename_validation_pattern.
Improved V‑Spark security to help protect against SQL injections.
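File names can be checked against the default forbidden characters listed above before they are uploaded. The sketch below is a client-side pre-check only and does not reproduce V‑Spark's own validation: the names FORBIDDEN_CHARS and is_allowed_filename are hypothetical, rejecting empty names is an added assumption, and the expected syntax of filename_validation_pattern is not described in these notes.

```shell
# Default forbidden characters from the release notes, as a POSIX bracket
# expression. Inside the brackets each character is matched literally.
FORBIDDEN_CHARS='[#*<>:?/\|{}$!'\''`"=^]'

# is_allowed_filename NAME
# Returns 0 when NAME is non-empty and contains none of the forbidden characters.
is_allowed_filename() {
  [ -n "$1" ] || return 1
  if printf '%s' "$1" | grep -Eq "$FORBIDDEN_CHARS"; then
    return 1
  fi
  return 0
}
```

Pass only the base file name, not a full path, because the slash is itself one of the forbidden characters.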
V‑Spark can now log which users have viewed the File Details page. To minimize excessive or unnecessary logging noise, this setting is off by default. To enable it, set the new system configuration option audit_filedetails_pageviews to on. When enabled, an audit entry is logged in the Activity Log and recorded in server.log as an INFO entry.
Links to HTML documentation in V‑Spark's help menu now point to Voci's Online Help. This update deprecates the release_note configuration option because the HTML version of the release notes has moved from the installation to the website. Note that the release_note_dl configuration option still exists because PDF versions of documentation are still bundled with the software.
Announcements with V‑Spark 4.0
-
System architecture has significantly changed with the V‑Spark 4.0 release. System administrators should take note of the new front-end libraries and other dependencies listed in the following table as they evaluate installation requirements.
Table 2. Changes in V‑Spark dependencies from version 3.5 to 4.0
| Dependency | Version used with V‑Spark 3.5 | Version used with V‑Spark 4.0 |
| --- | --- | --- |
| CentOS | 6 | 7 |
| Node | 6 | 14 |
| Elasticsearch | 5.6 | 7.6 |
| MySQL and MariaDB | MySQL 5.1 | MariaDB 5.5 (EPEL repository) |
| Redis | 3.2 | 3.2 (no changes, EPEL repository) |
| Moment.js | 2.17.1 | 2.29.1 |
| Bootstrap | 3.2.0 | 3.4.1 |
| jQuery | 2.2.4 | 3.6.0 |
Note: Customers upgrading from a 3.4.3 or 3.5 V‑Spark system running on CentOS 6 should contact customer support for recommended upgrade paths.
Preliminary testing shows that enhancements in the 4.0 release lead to a 10% average performance improvement during data ingestion over previous versions.
-
The location of some V‑Spark log files has changed. Some logged information now goes into CentOS 7's journald and can be viewed using journalctl from the command line, as in the following example:
$ sudo journalctl --unit=vspark-jobmgr
This change keeps V‑Spark log files consistent with systemd service handling best practices. V‑Spark 4.0 log locations are listed in the following tables:
Table 3. /var/log/vspark/
| Original Location | New Location |
| --- | --- |
| /var/log/vspark/front-err.log | journal |
| /var/log/vspark/back-err.log | journal |
| /var/log/vspark/backend_stdio.log | No change |
| /var/log/vspark/backendWorker.log | No change |
| /var/log/vspark/license.log | No change |
| /var/log/vspark/search.log | No change |
| /var/log/vspark/server.log | No change |
Table 4. /var/log/vocijobmgr/
| Original Location | New Location |
| --- | --- |
| /var/log/vocijobmgr/init.err | journal |
| /var/log/vocijobmgr/init.out | journal |
Table 5. /var/lib/vspark/managers/{company-org-folder}/logs/
| Original Location | New Location |
| --- | --- |
| * | No change |
-
The voci-spark-hdfs-lib package is now optional, and it is no longer installed by default, as most installations don't require HDFS support.
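Returning to the log files that moved into the journal, recent entries for a single service can be pulled with standard journalctl options. The wrapper below is a sketch under stated assumptions: the function name recent_journal and the JOURNALCTL override variable are hypothetical, and the assumption that front-err.log and back-err.log content now appears under the vspark-front and vspark-back units is inferred from the service names, not stated in these notes.

```shell
# Command used to reach journalctl; override for a dry run (e.g. JOURNALCTL=echo).
JOURNALCTL="${JOURNALCTL:-sudo journalctl}"

# recent_journal UNIT [SINCE]
# Show journal entries for one unit, limited to a time window (default: last hour).
recent_journal() {
  local unit="$1"
  local since="${2:-1 hour ago}"
  # Word splitting of $JOURNALCTL is intentional so "sudo journalctl" works.
  $JOURNALCTL --unit="$unit" --since="$since" --no-pager
}

# Example:
#   recent_journal vspark-jobmgr "2 hours ago"
```

Adding --follow to a plain journalctl call streams new entries as they arrive, which is the closest equivalent to tailing the old log files.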
Fixes in V‑Spark 4.0
The following issues have been resolved in the V‑Spark 4.0 release.
V‑Spark 4.0.2 Fixes
-
Action buttons will no longer appear enabled until the required form criteria are met. Previously, Login and Upload buttons would appear to be active even though using them would cause errors.
-
Resolved an issue with error and warning logging that caused some details to be discarded, or to be spread out across multiple lines. Log messages for most system errors and warnings now include more complete and consistent detail.
V‑Spark 4.0.1 Fixes
-
Resolved an issue in which folder permissions were sometimes incorrectly set when using the API to create folders. This caused the API call to fail sporadically, and caused the folders to be hidden and unusable until the permissions were correctly set.
-
Dashboard performance when viewing individual folders has been improved. Note that only single-folder dashboard performance was affected by this issue.
-
V‑Spark services now automatically restart during hardware boot and reboot.
-
Backend V‑Spark services now properly stop when Redis is down. Prior to this change, stopping the backend service while Redis was down could lead to errors that required killing backend processes manually in order to restart them.
-
The server status indicator for folders now consistently shows red or green regardless of user role.
-
The call volume data display on the Overview Dashboard now behaves more consistently. Previously, page refreshes would show either the monthly view or the 31-day view at random. The dashboard now shows the same view across subsequent page loads, either monthly or 31-day, depending on which was last selected.
-
Organizations created via the /config endpoint will now use the "US/Eastern" time zone if none is specified in the initial API call. Previously, no time zone would be assigned to the organization. This would eventually cause the UI to fail to render some pages, as certain pages expect time zone information to exist for organizations.
Known Issues in V‑Spark 4.0
-
Audio files ingested with incorrectly formatted JSON metadata files may be imported without their expected metadata fields. These files are flagged as BAD METADATA, but are still imported, transcribed, and analyzed.
-
Application changes may not display in real time when made by another user from a different host. Although application editing works, users editing an application simultaneously from different hosts must refresh the Application Editor to see changes made by another user. This issue does not typically occur when both users are being served by the same host.
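For the metadata issue listed above, one way to reduce the risk is to check that each JSON metadata file parses before it is submitted. The sketch below only detects JSON syntax errors and assumes python3 is available on the uploading host; the function name json_is_well_formed is hypothetical, and it does not verify that the fields match what V‑Spark expects.

```shell
# json_is_well_formed FILE
# Succeeds (exit 0) when FILE parses as JSON. Uses Python's standard
# json.tool module, which exits non-zero on a syntax error.
json_is_well_formed() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Example: report metadata files in the current directory that fail to parse.
#   for f in ./*.json; do json_is_well_formed "$f" || echo "malformed: $f"; done
```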