Package 'mdsOpt'

Title: Searching for Optimal MDS Procedure for Metric and Interval-Valued Data
Description: Selecting the optimal multidimensional scaling (MDS) procedure for metric data via metric MDS (ratio, interval, mspline) and nonmetric MDS (ordinal). Selecting the optimal multidimensional scaling (MDS) procedure for interval-valued data via metric MDS (ratio, interval, mspline). Selecting the optimal multidimensional scaling procedure for interval-valued data by varying all combinations of normalization and optimization methods. Selecting the optimal MDS procedure for statistical data referring to the evaluation of tourist attractiveness of Lower Silesian counties. (Borg, I., Groenen, P.J.F., Mair, P. (2013) <doi:10.1007/978-3-642-31848-1>, Walesiak, M. (2016) <doi:10.15611/ekt.2016.2.01>, Walesiak, M. (2017) <doi:10.15611/ekt.2017.3.01>).
Authors: Marek Walesiak [aut], Andrzej Dudek [aut, cre]
Maintainer: Andrzej Dudek <[email protected]>
License: GPL (>= 2)
Version: 0.7-6
Built: 2025-02-24 04:06:28 UTC
Source: https://github.com/cran/mdsOpt

The evaluation of tourist attractiveness of Lower Silesian counties

Description

The empirical study uses the statistical data presented in the article (Gryszel, Walesiak, 2014) and referring to the attractiveness level of 31 objects (29 Lower Silesian counties, pattern and antipattern object). The evaluation of tourist attractiveness of Lower Silesian counties was performed using 16 metric variables (measured on a ratio scale): x1 – beds in hotels per 1 km2 of a county area, x2 – number of nights spent daily by resident tourists per 1000 inhabitants of a county, x3 – number of nights spent daily by foreign tourists per 1000 inhabitants of a county, x4 – gas pollution emission in tons per 1 km2 of a county area, x5 – number of criminal offences and crimes against life and health per 1000 inhabitants of a county, x6 – number of property crimes per 1000 inhabitants of a county, x7 – number of historical buildings per 100 km2 of a county area, x8 – x9 – x10 – number of events as well as cultural and tourist ventures in a county, x11 – number of natural monuments calculated per 1 km2 of a county area, x12 – number of tourist economy entities per 1000 inhabitants of a county (natural and legal persons), x13 – expenditure of municipalities and counties on tourism, culture and national heritage protection as well as physical culture per 1 inhabitant of a county in PLN, x14 – viewers in cinemas per 1000 inhabitants of a county, x15 – museum visitors per 1000 inhabitants of a county, x16 – number of construction permits (hotels and accommodation buildings, commercial and service buildings, transport and communication buildings, civil and water engineering constructions) issued in a county in the years 2011-2012 per 1 km2 of a county area. The statistical data were collected in 2012 and come from the Local Data Bank of the Central Statistical Office of Poland; only the data for variable x7 were obtained from the regional conservation officer.

Format

data.frame: 31 objects (29 counties, pattern and antipattern object), 16 variables. The coordinates of the pattern object are the most preferred values of the preference variables (stimulants, destimulants, nominants). The coordinates of the antipattern object are the least preferred values of the preference variables.
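As a rough illustration of how pattern and antipattern rows relate to the variable types, the sketch below builds them for a toy data set. It assumes that the most preferred value of a stimulant is its observed maximum and that of a destimulant its observed minimum; nominants are left out because they need a nominal value. The data, column names and the `type` vector are hypothetical and are not taken from data_lower_silesian.

```r
# Toy data: two stimulants (higher is better) and one destimulant (lower is better).
# All names and values are hypothetical.
x <- data.frame(
  beds   = c(1.2, 0.4, 2.5),
  events = c(10, 25, 7),
  crimes = c(30, 12, 18),
  row.names = c("A", "B", "C")
)
type <- c(beds = "s", events = "s", crimes = "d")  # s = stimulant, d = destimulant

col_max <- sapply(x, max)
col_min <- sapply(x, min)

# Pattern: most preferred value of each variable; antipattern: least preferred.
pattern     <- ifelse(type == "s", col_max, col_min)
antipattern <- ifelse(type == "s", col_min, col_max)

# Append both reference objects as additional rows.
x_ext <- rbind(x, pattern = pattern, antipattern = antipattern)
print(x_ext)
```

In this toy layout the two reference objects are the last two rows of the table.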

Source

Gryszel, P., Walesiak, M., (2014), Zastosowanie uogólnionej miary odległości GDM w ocenie atrakcyjności turystycznej powiatów Dolnego Śląska [The Application of the General Distance Measure (GDM) in the Evaluation of Lower Silesian Districts’ Attractiveness], Folia Turistica, 31, 127-147. Available at: http://www.folia-turistica.pl/attachments/article/402/FT_31_2014.pdf.

Examples

library(mdsOpt)
  metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
  metscale<-c("ratio","interval")
  metdist<-c("euclidean","GDM1")
  data(data_lower_silesian)
  res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,
  distances=metdist,mdsmodels=metscale)
  print(findOptimalSmacofSym(res))

Draw a series of isoquants

Description

Draws a series of isoquants. An isoquant is a contour line drawn through the set of points at which the same quantity of output is produced while the quantities of two or more inputs change.

Usage

drawIsoquants(x,y=NULL,number=6,steps=NULL)

Arguments

x

two-dimensional point (center)

y

optional, second point used to calculate the step size if steps is NULL

number

number of isoquants

steps

distances between successive isoquants, starting from x; if this argument is shorter than number, the last item is repeated

Value

This is a plotting function, so it does not return a value
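The way number and steps interact can be sketched in base R. The helper `isoquant_radii` below is hypothetical and not part of mdsOpt; it only mirrors the rule stated above (the last step is repeated when steps is shorter than number) and assumes the isoquants are concentric circles whose radii are the running sums of the steps.

```r
# Hypothetical helper (not part of mdsOpt): radii of `number` concentric isoquants.
# If `steps` is shorter than `number`, its last item is repeated.
isoquant_radii <- function(number = 6, steps) {
  if (length(steps) < number) {
    steps <- c(steps, rep(steps[length(steps)], number - length(steps)))
  }
  cumsum(steps[seq_len(number)])
}

radii <- isoquant_radii(number = 8, steps = c(0.3, 0.2))
print(radii)

# Draw the circles around a centre point with base graphics.
centre <- c(0, 0)
theta <- seq(0, 2 * pi, length.out = 200)
plot(NA, xlim = c(-2, 2), ylim = c(-2, 2), asp = 1,
     xlab = "Dimension 1", ylab = "Dimension 2")
for (r in radii) {
  lines(centre[1] + r * cos(theta), centre[2] + r * sin(theta), lty = 2)
}
```

With steps=c(0.3,0.2) and number=8, the first circle lies 0.3 from the centre and each later circle is a further 0.2 out.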

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Walesiak, M., (2016), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.

Walesiak, M. (2017), The application of multidimensional scaling to measure and assess changes in the level of social cohesion of the Lower Silesia region in the period 2005-2015, Ekonometria, 3(57), 9-25. Available at: doi:10.15611/ekt.2017.3.01.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

Examples

#Example 1
library(mdsOpt)
library(smacof)
library(clusterSim)
data(data_lower_silesian)
z<-data.Normalization(data_lower_silesian, type="n1")
d<-dist.GDM(z, method="GDM1")
res <- smacofSym(delta=d,ndim=2,type="interval")
print("Objects configuration", quote=FALSE)
plot(res, plot.type="confplot")
r1<-res$conf[nrow(z),1]
r2<-res$conf[nrow(z),2]
r3<-res$conf[nrow(z)-1,1]
r4<-res$conf[nrow(z)-1,2]
arrows(r1,r2,r3,r4,length=0.1,col="black")
res_up<-as.matrix(dist(res$conf,method="euclidean"))
drawIsoquants(res$conf[nrow(z)-1,],steps=max(res_up)/6)
# or 
# drawIsoquants(res$conf[nrow(z)-1,],steps=c(0.3,0.2),number=8)

#Example 2
library(mdsOpt)
library(smacof)
library(clusterSim)
data(data_lower_silesian)
z<-data.Normalization(data_lower_silesian, type="n1")
d<-dist.GDM(z, method="GDM1")
res<-smacofSym(delta=d,ndim=2,type="interval")
res1<-res$conf
#write.table(res1,"conf_2d.csv",dec=",",sep=";",col.names=NA,row.names=TRUE)
alfa<- 1.05*pi
a<- cos(alfa)
b<- -sin(alfa)
c<- sin(alfa)
d<- cos(alfa)
D<-array(c(a,b,c,d), c(2,2))
#res1<-read.csv2("conf_2d.csv", header=TRUE, row.names=1)
res1<-as.matrix(res1)
res2<-res1
plot(res2, xlab="Dimension 1",ylab="Dimension 2",main="",asp=1)
points(res2[1:31,],pch=1,font=2)
text(res2[c(1:31),],pos=3,cex=0.7,row.names(z[c(1:31),]))
r1<-res2[nrow(z),1]
r2<-res2[nrow(z),2]
r3<-res2[nrow(z)-1,1]
r4<-res2[nrow(z)-1,2]
arrows(r1,r2,r3,r4,length=0.1,col="black")
res_up<-as.matrix(dist(res2,method="euclidean"))
drawIsoquants(res2[nrow(z)-1,],steps=max(res_up)/6)

Selecting the optimal multidimensional scaling (MDS) procedure

Description

Selecting the optimal multidimensional scaling procedure - metric MDS (by varying all combinations of normalization methods, distance measures, and metric MDS models) and nonmetric MDS (by varying all combinations of normalization methods and distance measures)

Usage

findOptimalSmacofSym(table,
critical_stress=(max(as.numeric(gsub(",",".",table[,"STRESS 1"],fixed=TRUE)))+
min(as.numeric(gsub(",",".",table[,"STRESS 1"],fixed=TRUE))))/2,
critical_HHI=NA)

Arguments

table

result from optSmacofSym_nMDS or optSmacofSym_mMDS. Data frame ordered by increasing value of Stress-1 fit measure or HHI index with columns:

Normalization method

Distance measure

MDS model

Spline degree

STRESS 1

HHI spp

critical_stress

threshold value of Kruskal's Stress-1 fit measure. Default - mid-range of Kruskal's Stress-1 fit measures calculated for all MDS procedures

critical_HHI

threshold value of Hirschman-Herfindahl HHI index. Only one of the parameters critical_stress and critical_HHI can be set, and the function finds the optimal value among the procedures for which the selected measure is lower than or equal to the threshold value

Value

Nr

row number in table of the optimal multidimensional scaling procedure

Normalization_method

normalization method used for optimal multidimensional scaling procedure

MDS_model

MDS model used for optimal multidimensional scaling procedure

Spline_degree

Additional spline.degree value for optimal procedure, if mspline model is used for simulation. For other models there is no value for this field

Distance_measure

distance measure used for optimal multidimensional scaling procedure

STRESS_1

value of Kruskal Stress-1 fit measure for optimal multidimensional scaling procedure

HHI_spp

Hirschman-Herfindahl HHI index, calculated based on stress per point, for optimal multidimensional scaling procedure
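The selection rule described in the arguments can be sketched in base R. The table below is made up, and the sketch assumes that, when critical_stress is used, the optimum is the admissible procedure (Stress-1 not above the threshold) with the smallest HHI; the package's own implementation may differ in detail.

```r
# Hypothetical results table with the two fit columns named as in the text.
tab <- data.frame(
  "STRESS 1" = c("0,08", "0,10", "0,16", "0,22"),
  "HHI spp"  = c("520,10", "410,35", "395,20", "380,00"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Values may use a decimal comma (outDec=","), so convert before comparing.
to_num <- function(v) as.numeric(gsub(",", ".", v, fixed = TRUE))
stress <- to_num(tab[, "STRESS 1"])
hhi    <- to_num(tab[, "HHI spp"])

# Default threshold: mid-range of the Stress-1 values.
critical_stress <- (max(stress) + min(stress)) / 2

# Assumed rule: among admissible procedures, take the one with the lowest HHI.
admissible <- which(stress <= critical_stress)
best <- admissible[which.min(hhi[admissible])]
print(tab[best, ])
```

Here rows 1 and 2 pass the threshold of 0.15, and row 2 is chosen because its HHI is lower.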

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.

Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.

De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.

Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.

Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.

Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.

Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.

Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

See Also

data.Normalization, dist.GDM, dist, smacofSym

Examples

library(mdsOpt)
  metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
  metscale<-c("ratio","interval")
  metdist<-c("euclidean","manhattan","maximum","seuclidean","GDM1")
  data(data_lower_silesian)
  res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,
  distances=metdist,mdsmodels=metscale,outDec=".")
  print(findOptimalSmacofSym(res))

Selecting the optimal multidimensional scaling procedure - metric MDS

Description

Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods, distance measures, and metric MDS models

Usage

optSmacofSym_mMDS(x,normalizations=NULL,distances=NULL,
mdsmodels=NULL,weights=NULL,spline.degrees=c(2),
outputCsv="",outputCsv2="",outDec=",",
stressDigits=6,HHIDigits=2,...)

Arguments

x

matrix or dataset

normalizations

optional, vector of normalization methods that should be used in procedure

distances

optional, vector of distance measures (Manhattan, Euclidean, Chebyshev, squared Euclidean, GDM1) that should be used in procedure

mdsmodels

optional, vector of MDS models (ratio, interval, mspline) that should be used in procedure

spline.degrees

optional, vector (e.g. 2:4) of spline.degree parameter values that should be used in procedure for mspline model

weights

optional, variable weights used in distance calculation. Each weight takes value from interval [0; 1] and sum of weights equals one

outputCsv

optional, name of csv file with results

outputCsv2

optional, name of csv (comma as decimal point sign) file with results

outDec

decimal sign used in returned table

stressDigits

Number of decimal digits for displaying Stress 1 value

HHIDigits

Number of decimal digits for displaying HHI spp value

...

arguments passed to smacofSym, like ndim, itmax, eps and others
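To get a feel for how many MDS procedures a call compares, the combinations can be enumerated in base R. This is only a sketch under the assumption that every normalization is crossed with every distance and every model, and that the mspline model is run once per value of spline.degrees; the short vectors below are illustrative choices, and `procedures` is a hypothetical name.

```r
normalizations <- c("n1", "n2", "n3")
distances      <- c("euclidean", "GDM1")
mdsmodels      <- c("ratio", "interval", "mspline")
spline.degrees <- 2:3

# All normalization x distance x model combinations.
grid <- expand.grid(normalization = normalizations, distance = distances,
                    model = mdsmodels, stringsAsFactors = FALSE)
grid$spline.degree <- NA_integer_

# Assumed: mspline rows are repeated once per spline degree.
ms <- grid[grid$model == "mspline", ]
ms_expanded <- do.call(rbind, lapply(spline.degrees, function(s) {
  out <- ms
  out$spline.degree <- s
  out
}))

procedures <- rbind(grid[grid$model != "mspline", ], ms_expanded)
print(nrow(procedures))
```

Under these assumptions the count grows multiplicatively, so long vectors of normalizations and distances can make a run noticeably slower.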

Details

Parameter normalizations may be a subset of the following values:

"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a","n7","n8","n9","n9a","n10","n11","n12","n12a","n13"

(e.g. normalizations=c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a"))

If normalizations is set to "n0", no normalization is applied.

Parameter distances may be a subset of the following values:

"euclidean","manhattan","maximum","seuclidean","GDM1"

(e.g. distances=c("euclidean","manhattan"))

Parameter mdsmodels may be a subset of the following values (metric MDS):

"ratio","interval","mspline" (e.g. c("ratio","interval"))

Value

Data frame ordered by increasing value of Stress-1 fit measure with columns:

Normalization method

normalization method used for p-th multidimensional scaling procedure

MDS model

MDS model used for p-th multidimensional scaling procedure

Spline degree

Additional spline.degree value if mspline model is used for simulation, for other models there is no value in this cell

Distance measure

distance measure used for p-th multidimensional scaling procedure

STRESS 1

value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure

HHI spp

Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure
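The HHI spp column can be read with the help of a small base R sketch. It assumes the usual Hirschman-Herfindahl form, the sum of squared percentage shares, applied to stress per point expressed as shares that add up to 100; `hhi_spp` is a hypothetical helper, not a function exported by mdsOpt.

```r
# Hypothetical helper: HHI of stress-per-point contributions.
# Contributions are rescaled to percentage shares summing to 100 (assumed convention).
hhi_spp <- function(spp) {
  share <- 100 * spp / sum(spp)
  sum(share^2)
}

even         <- hhi_spp(rep(1, 4))       # error spread evenly over 4 objects
concentrated <- hhi_spp(c(97, 1, 1, 1))  # one object carries most of the error
print(c(even = even, concentrated = concentrated))
```

Under this convention the index lies between 10000/n (error spread evenly over n objects) and 10000 (all error on one object), so lower values indicate a more even distribution of the fit error.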

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.

Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.

De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.

Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.

Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.

Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.

Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.

Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

See Also

data.Normalization, dist.GDM, dist, smacofSym

Examples

library(mdsOpt)
  metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
  metscale<-c("ratio","interval","mspline")
  metdist<-c("euclidean","manhattan","seuclidean","maximum","GDM1")
  data(data_lower_silesian)
  res<-optSmacofSym_mMDS(data_lower_silesian,normalizations=metnor,distances=metdist,
    mdsmodels=metscale, spline.degrees=c(2:3),outDec=".")
  stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
  hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
  cs<-(min(stress)+max(stress))/2 # critical stress
  t<-findOptimalSmacofSym(res,critical_stress=cs)
  print(t)
  plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
  text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
  abline(v=cs,col="red")
  points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
  text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")

Selecting the optimal multidimensional scaling procedure - nonmetric MDS

Description

Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods and distance measures

Usage

optSmacofSym_nMDS(x,normalizations=NULL,distances=NULL,
mdsmodels=c("ordinal"),weights=NULL,
outputCsv="",outputCsv2="",outDec=",",
stressDigits=6,HHIDigits=2,...)

Arguments

x

matrix or dataset

normalizations

optional, vector of normalization methods that should be used in procedure

distances

optional, vector of distance measures (Manhattan, Euclidean, Chebyshev, squared Euclidean, GDM1) that should be used in procedure

mdsmodels

"ordinal" (nonmetric MDS)

weights

optional, variable weights used in distance calculation. Each weight takes value from interval [0; 1] and sum of weights equals one

outputCsv

optional, name of csv file with results

outputCsv2

optional, name of csv (comma as decimal point sign) file with results

outDec

decimal sign used in returned table

stressDigits

Number of decimal digits for displaying Stress 1 value

HHIDigits

Number of decimal digits for displaying HHI spp value

...

arguments passed to smacofSym

Details

Parameter normalizations may be a subset of the following values:

"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a","n7","n8","n9","n9a","n10","n11","n12","n12a","n13"

(e.g. normalizations=c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a"))

If normalizations is set to "n0", no normalization is applied.

Parameter distances may be a subset of the following values:

"euclidean", "manhattan","maximum","seuclidean","GDM1"

(e.g. distances=c("euclidean","manhattan"))

Parameter mdsmodels takes the single value "ordinal" (nonmetric MDS)

Value

Data frame ordered by increasing value of Stress-1 fit measure with columns:

Normalization method

normalization method used for p-th multidimensional scaling procedure

MDS model

"ordinal" MDS model (nonmetric MDS) for p-th multidimensional scaling procedure

Distance measure

distance measure used for p-th multidimensional scaling procedure

STRESS 1

value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure

HHI spp

Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure
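For orientation, the STRESS 1 column can be related to the textbook form of Kruskal's Stress-1, which compares the configuration distances with the disparities (in nonmetric MDS, the monotonically transformed dissimilarities). The sketch below uses that textbook formula; `stress1` is a hypothetical helper, and smacofSym may normalize the disparities differently internally.

```r
# Textbook Kruskal Stress-1: sqrt( sum (d - dhat)^2 / sum d^2 ) over all object pairs.
# d    = distances between points in the MDS configuration
# dhat = disparities (transformed dissimilarities)
stress1 <- function(dhat, d) {
  dhat <- as.numeric(dhat)
  d <- as.numeric(d)
  sqrt(sum((d - dhat)^2) / sum(d^2))
}

# Toy configuration of three points; pairwise distances are 3, 4 and 5.
conf <- matrix(c(0, 0,
                 3, 0,
                 0, 4), ncol = 2, byrow = TRUE)
d <- dist(conf)

perfect   <- stress1(dhat = d, d = d)           # disparities reproduced exactly
imperfect <- stress1(dhat = c(3, 4, 6), d = d)  # one disparity off by 1
print(c(perfect = perfect, imperfect = imperfect))
```

A value of zero means the configuration reproduces the disparities exactly; larger values mean a poorer fit.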

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.

Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.

De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.

Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.

Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.

Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.

Walesiak, M. (2016a), Wybór grup metod normalizacji wartości zmiennych w skalowaniu wielowymiarowym [The Choice of Groups of Variable Normalization Methods in Multidimensional Scaling], Przegląd Statystyczny, tom 63, z. 1, 7-18. Available at: doi:10.5604/01.3001.0014.1145.

Walesiak, M. (2016b), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

See Also

data.Normalization, dist.GDM, dist, smacofSym

Examples

library(mdsOpt)
  metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
  metscale<-"ordinal"
  metdist<-c("euclidean","manhattan","maximum","seuclidean","GDM1")
  data(data_lower_silesian)
  res<-optSmacofSym_nMDS(data_lower_silesian,normalizations=metnor,
  distances=metdist,mdsmodels=metscale)
  stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
  hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
  cs<-(min(stress)+max(stress))/2 # critical stress
  t<-findOptimalSmacofSym(res,critical_stress=cs)
  print(t)
  plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
  text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
  abline(v=cs,col="red")
  points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
  text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")

Selecting the optimal multidimensional scaling procedure for interval-valued data

Description

Selecting the optimal multidimensional scaling procedure by varying all combinations of normalization methods, distance measures for interval-valued data, and metric MDS models.

Usage

optSmacofSymInterval(x,dataType="simple",normalizations=NULL,
distances=NULL,mdsmodels=NULL,spline.degrees=c(2),outputCsv="",
outputCsv2="",y=NULL,outDec=",",
stressDigits=6,HHIDigits=2,...)

Arguments

x

interval-valued data table or matrix or dataset

dataType

Type of symbolic data table passed to function:

'sda' - full symbolicDA format object;

'simple' - three dimensional array with lower and upper bound of intervals in third dimension;

'separate_tables' - lower bound of intervals in x, upper bound of intervals in y;

'rows' - lower and upper bound of intervals in neighbouring rows;

'columns' - lower and upper bound of intervals in neighbouring columns

normalizations

optional, vector of normalization methods that should be used in procedure

distances

optional, vector of distance measures (Hausdorff, Ichino-Yaguchi) that should be used in procedure

mdsmodels

optional, vector of MDS models (ratio, interval, mspline) that should be used in procedure

spline.degrees

optional, vector (e.g. 2:4) of spline.degree parameter values that should be used in procedure for mspline model

outputCsv

optional, name of csv file with results

outputCsv2

optional, name of csv (comma as decimal point sign) file with results

y

matrix or dataset with upper bounds of intervals if argument dataType is equal to "separate_tables"

outDec

decimal sign used in returned table

stressDigits

Number of decimal digits for displaying Stress 1 value

HHIDigits

Number of decimal digits for displaying HHI spp value

...

arguments passed to smacofSym, like ndim, itmax, eps and others
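The 'simple' and 'separate_tables' layouts of dataType can be illustrated with a small base R sketch. It assumes objects in rows, variables in columns, and the lower bounds stored before the upper bounds in the third dimension; the object and variable names are hypothetical.

```r
# Lower and upper bounds for 3 objects and 2 interval-valued variables (toy values).
lower <- matrix(c(1, 10,
                  2, 20,
                  3, 30), nrow = 3, byrow = TRUE,
                dimnames = list(c("o1", "o2", "o3"), c("v1", "v2")))
upper <- lower + matrix(c(0.5, 5,
                          1.0, 2,
                          0.2, 4), nrow = 3, byrow = TRUE)

# 'simple' layout: objects x variables x 2, bounds in the third dimension (assumed order).
x <- array(c(lower, upper), dim = c(3, 2, 2),
           dimnames = list(rownames(lower), colnames(lower), c("lower", "upper")))

# Every interval should satisfy lower <= upper.
stopifnot(all(x[, , "lower"] <= x[, , "upper"]))
print(x["o2", "v2", ])

# The same data in the 'separate_tables' layout would be passed as x=lower, y=upper.
```

Checking that every lower bound does not exceed its upper bound before calling the function is a cheap way to catch a swapped layout.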

Details

Parameter normalizations may be a subset of the following values:

"n1","n2","n3","n3a","n4","n5","n5a","n6","n6a","n7","n8","n9","n9a","n10","n11","n12","n12a","n13"

(e.g. normalizations=c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a"))

If normalizations is set to "n0", no normalization is applied.

Parameter distances may be a subset of the following values:

"H_q1","H_q2","U_2_q1","U_2_q2" (In following order: Hausdorff distance with q=1, Euclidean Hausdorff distance with q=2, Ichino-Yaguchi distance with q=1; Euclidean Ichino-Yaguchi distance with q=2)

(e.g. distances=c("H_q1","U_2_q1"))
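As a rough guide to what the Hausdorff options measure, the sketch below uses the common textbook form of the Hausdorff distance between interval-valued objects: for each variable take the larger of the absolute differences between the lower bounds and between the upper bounds, then aggregate over variables with power q. `hausdorff_interval` is a hypothetical helper, and the implementation used by the package may differ (for example in scaling).

```r
# Hypothetical helper: Hausdorff-type distance between two interval-valued objects.
# a, b: matrices with one row per variable and columns (lower, upper).
hausdorff_interval <- function(a, b, q = 1) {
  h <- pmax(abs(a[, 1] - b[, 1]), abs(a[, 2] - b[, 2]))
  sum(h^q)^(1 / q)
}

a <- cbind(lower = c(1, 10), upper = c(2, 15))
b <- cbind(lower = c(4, 10), upper = c(6, 18))

h_q1 <- hausdorff_interval(a, b, q = 1)  # city-block style aggregation
h_q2 <- hausdorff_interval(a, b, q = 2)  # Euclidean style aggregation
print(c(q1 = h_q1, q2 = h_q2))
```

The per-variable terms here are 4 and 3, so q=1 adds them while q=2 combines them as in a Euclidean norm.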

Parameter mdsmodels may be a subset of the following values (metric MDS):

"ratio","interval","mspline" (e.g. c("ratio","interval"))

Value

Data frame ordered by increasing value of Stress-1 fit measure with columns:

Normalization method

normalization method used for p-th multidimensional scaling procedure

MDS model

MDS model used for p-th multidimensional scaling procedure

Spline degree

Additional spline.degree value if mspline model is used for simulation, for other models there is no value in this cell

Distance measure

distance measures for interval-valued data used for p-th multidimensional scaling procedure

STRESS 1

value of Kruskal Stress-1 fit measure for p-th multidimensional scaling procedure

HHI spp

Hirschman-Herfindahl HHI index calculated based on stress per point for p-th multidimensional scaling procedure

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Borg, I., Groenen, P.J.F. (2005), Modern Multidimensional Scaling. Theory and Applications, 2nd Edition, Springer Science+Business Media, New York. ISBN: 978-0387-25150-9. Available at: https://link.springer.com/book/10.1007/0-387-28981-X.

Borg, I., Groenen, P.J.F., Mair, P. (2013), Applied Multidimensional Scaling, Springer, Heidelberg, New York, Dordrecht, London. Available at: doi:10.1007/978-3-642-31848-1.

De Leeuw, J., Mair, P. (2015), Shepard Diagram, Wiley StatsRef: Statistics Reference Online, John Wiley & Sons Ltd.

Dudek, A., Walesiak, M. (2020), The Choice of Variable Normalization Method in Cluster Analysis, pp. 325-340, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

Herfindahl, O.C. (1950), Concentration in the Steel Industry, Doctoral thesis, Columbia University.

Hirschman, A.O. (1964), The Paternity of an Index, The American Economic Review, Vol. 54, 761-762.

Walesiak, M. (2014), Przegląd formuł normalizacji wartości zmiennych oraz ich własności w statystycznej analizie wielowymiarowej [Data Normalization in Multivariate Data Analysis. An Overview and Properties], Przegląd Statystyczny, tom 61, z. 4, 363-372. Available at: doi:10.5604/01.3001.0016.1740.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

Walesiak, M., Dudek, A. (2020), Searching for an Optimal MDS Procedure for Metric and Interval-Valued Data using mdsOpt R package, pp. 307-324, [In:] K. S. Soliman (Ed.), Education Excellence and Innovation Management: A 2025 Vision to Sustain Economic Development during Global Challenges, Proceedings of the 35th International Business Information Management Association Conference (IBIMA), 1-2 April 2020, Seville, Spain. ISBN: 978-0-9998551-4-1.

See Also

data.Normalization, interval_normalization, dist.Symbolic, smacofSym

Examples

library(mdsOpt)
 library(clusterSim)
 data(data_symbolic_interval_polish_voivodships)
 metnor<-c("n1","n2","n3","n5","n5a","n8","n9","n9a","n11","n12a")
 metscale<-c("ratio","interval","mspline")
 metdist<-c("H_q1","H_q2","U_2_q1","U_2_q2")
 res<-optSmacofSymInterval(data_symbolic_interval_polish_voivodships,dataType="simple",
 normalizations=metnor,distances=metdist,mdsmodels=metscale,spline.degrees=c(2,3),outDec=".")
 stress<-as.numeric(gsub(",",".",res[,"STRESS 1"],fixed=TRUE))
 hhi<-as.numeric(gsub(",",".",res[,"HHI spp"],fixed=TRUE))
 t<-findOptimalSmacofSym(res)
 cs<-(min(stress)+max(stress))/2 # critical stress
 plot(stress[-t$Nr],hhi[-t$Nr], xlab="Stress-1", ylab="HHI",type="n",font.lab=3)
 text(stress[-t$Nr],hhi[-t$Nr],labels=(1:nrow(res))[-t$Nr])
 abline(v=cs,col="red")
 points(stress[t$Nr],hhi[t$Nr], cex=5,col="red")
 text(stress[t$Nr],hhi[t$Nr],labels=(1:nrow(res))[t$Nr],col="red")
 print(t)

Creates video by FFmpeg with animation of rotated dataset

Description

This function opens a graphics device to record the images produced in the code expr, then uses FFmpeg to convert these images to a video.

Usage

rotation2dAnimation(conf2d,
ani.interval=0.2,
ani.nmax=361,
ani.width=500,
ani.height=500,
ani.video.name="mds_rotate.mp4",
angle.start=-pi,
angle.stop=pi,
angle.step=pi/180)

Arguments

conf2d

two-dimensional dataset or matrix

ani.video.name

the file name of the output video (e.g. ‘animation.mp4’ or ‘animation.avi’)

ani.interval

interval between animation frames

ani.nmax

maximal number of frames

ani.width

width of movie

ani.height

height of movie

angle.start

starting angle for animation

angle.stop

end angle for animation

angle.step

step of animation in radians

Details

This function uses system to call FFmpeg to convert the images to a single video. The command line used in this function is: ffmpeg -y -r <1/interval> -i <img.name>%d.<ani.type> other.opts video.name

where interval comes from ani.options('interval'), and ani.type is from ani.options('ani.type'). For more details on the numerous options of FFmpeg, please see the reference.

Some Linux systems may use the alternative software 'avconv' instead of 'ffmpeg'. The package will attempt to determine which command is present and set ani.options('ffmpeg') to an appropriate default value. This can be overridden by passing in the ffmpeg argument.
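The geometric part of the animation can be sketched without FFmpeg. The code below assumes each frame shows the configuration rotated about the origin by one angle from the sequence defined by angle.start, angle.stop and angle.step; `rotate2d` is a hypothetical helper, not a function exported by mdsOpt, and the toy configuration is made up.

```r
# Hypothetical helper: rotate the rows of a two-column matrix counterclockwise by `angle`.
rotate2d <- function(conf, angle) {
  R <- matrix(c(cos(angle), sin(angle),
                -sin(angle), cos(angle)), nrow = 2)  # columns of the rotation matrix
  conf %*% t(R)
}

conf <- matrix(c(1, 0,
                 0, 2,
                 -1, 1), ncol = 2, byrow = TRUE)

# One frame per angle, using the default start, stop and step from the usage above.
angles <- seq(-pi, pi, by = pi / 180)
frames <- lapply(angles, function(a) rotate2d(conf, a))
print(length(frames))

# Rotation leaves the distances between objects unchanged.
print(all.equal(as.numeric(dist(conf)), as.numeric(dist(frames[[100]]))))
```

Because rotation preserves all pairwise distances, every frame is an equally valid MDS solution; the animation only helps to pick an orientation that is convenient to interpret.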

Value

An integer indicating failure (-1) or success (0) of the conversion (refer to system).

Author(s)

Marek Walesiak, Andrzej Dudek

Department of Econometrics and Computer Science, Wroclaw University of Economics and Business, Poland

References

Walesiak, M. (2016), Visualization of Linear Ordering Results for Metric Data with the Application of Multidimensional Scaling, Ekonometria, 2(52), 9-21. Available at: doi:10.15611/ekt.2016.2.01.

Walesiak, M. (2017), The application of multidimensional scaling to measure and assess changes in the level of social cohesion of the Lower Silesia region in the period 2005-2015, Ekonometria, 3(57), 9-25. Available at: doi:10.15611/ekt.2017.3.01.

Walesiak, M., Dudek, A. (2017), Selecting the Optimal Multidimensional Scaling Procedure for Metric Data with R Environment, STATISTICS IN TRANSITION new series, September, Vol. 18, No. 3, pp. 521-540.

https://yihui.org/animation/example/savevideo/

http://ffmpeg.org/documentation.html

See Also

Other utilities: im.convert, saveGIF, saveHTML, saveLatex, saveSWF

Examples

library(mdsOpt)
    library(smacof)
    library(animation)
    library(spdep)
    library(clusterSim)
    data(data_lower_silesian)
    z<-data.Normalization(data_lower_silesian, type="n1")
    d<-dist.GDM(z, method="GDM1")
    res<-smacofSym(delta=d,ndim=2,type="interval")
    konf<-as.matrix(res$conf)
    #Uncomment only if ffmpeg is properly installed for animation package 
    #see:  https://yihui.org/animation/example/savevideo/ 
    #oopts = if (.Platform$OS.type == "windows") {
    # ani.options(ffmpeg = "D:/Installer/ffmpeg/bin/ffmpeg.exe")
    #}
    #rotation2dAnimation(conf2d=konf,angle.start=-0,angle.stop=2*pi)