Social Computing Data Analysis Project

Data Analysis Software: Gephi

Data collection: We collected the data from Facebook using Netvizz (depth parameter: 2).

Data analysis: We used the page like network module to analyze the network around the official Facebook page of the BCB (Bangladesh Cricket Board), Bangladesh Cricket: The Tigers.

The module captures how the BCB Facebook page is connected to other cricket pages through page likes. In the data, each cricket page is a node. The resulting directed graph has 235 nodes and 1769 edges, which means that 235 cricket pages are interconnected through likes between them. The data points to the four most popular Facebook pages, around which the network clusters. We ran the layout algorithms with the repulsion strength set to 3000 to get a readable and aesthetically pleasing representation of our data.

Explore the Data Set According to Our Analysis:

  1. Calculate some overall network measures (like density, clustering, betweenness centrality, degree centrality). What do these measures tell you about the network?
  • Density: the density of the network is 0.032, which is quite low: only about 3% of all possible links between the cricket pages are present. The connections are concentrated on a few significant, highly connected pages, such as Cricket South Africa.
  • Clustering: because the graph is directed, we used directed mode to find how nodes are embedded in their neighborhood. The average clustering coefficient gives an overall indication of the clustering in the network.
  • Average clustering coefficient: 0.394

The average clustering coefficient is the mean of the clustering coefficients of the individual nodes.

  • Betweenness centrality: pages that are liked very often tend to lie on many paths, so their betweenness centrality should be high. The most popular liked page in the cricket world, cricket.com.au, ranks first with a betweenness centrality of 15627.942264, whereas Bangladesh Cricket is comparatively low at 2348.902423. This means that cricket.com.au is more likely to lie on the communication paths between other nodes. It also means that many connections between other nodes would be cut off if this node disappeared.
  • Degree centrality: the average degree of this graph is 7.528.
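The density and average degree quoted above can be reproduced from the node and edge counts alone. Here is a minimal Python sketch, assuming the usual definition of density for a directed graph without self-loops (edges divided by n(n-1)) and average degree as edges per node. The function names are ours and only for illustration.

```python
def directed_density(n_nodes, n_edges):
    """Share of all possible directed links (no self-loops) that are present."""
    return n_edges / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_edges):
    """Average number of outgoing (equivalently incoming) links per node."""
    return n_edges / n_nodes


# Node and edge counts taken from the data set described above.
NODES, EDGES = 235, 1769

print(round(directed_density(NODES, EDGES), 3))  # 0.032
print(round(average_degree(NODES, EDGES), 3))    # 7.528
```

Both values match the ones reported by Gephi, which suggests that the 7.528 figure is the number of edges per node.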

The popular cricket pages on Facebook have a high degree centrality in the page like network. In the data set, the big node cricket.com.au has the highest in-degree and out-degree, Cricket South Africa is in second place and the ICC is third, followed by pages with successively smaller degree. This suggests that cricket.com.au, as a central node, is well placed to spread information and to influence the pages in its immediate neighborhood.

cricket.com.au degree of centrality

  2. Find the nodes with the highest (in/out) degree, betweenness centrality and Eigenvector centrality. Are there differences in how the nodes rank in these measures? Why is there a difference?
  • In/out degree:
    • cricket.com.au: 87/84
    • ICC – International Cricket Council: 48/47
    • Cricket South Africa: 39/56
    • Australian Men’s Cricket Team: 40/53
  • Eigenvector centrality: this is similar to the way Google ranks web pages: links from highly linked-to pages count more. It identifies the nodes that are connected to the most connected nodes.
    • cricket.com.au: 0.049856
    • ICC – International Cricket Council: 0.028939
    • Cricket South Africa: 0.022776
    • Australian Men’s Cricket Team: 0.021195
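To make the two measures above concrete, here is a small Python sketch that counts in/out degree and approximates eigenvector centrality by power iteration on a toy like network. The page names (`hub`, `p1`, `p2`, `p3`) and the edges are invented for illustration and are not part of our data set. Gephi's own implementation and normalization may differ, so the numbers are not comparable with the values listed above.

```python
from collections import defaultdict

# Hypothetical toy network: (source, target) means "source likes target".
EDGES = [
    ("p1", "hub"), ("p2", "hub"), ("p3", "hub"),
    ("hub", "p1"), ("hub", "p3"), ("p1", "p2"),
]


def degree_counts(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        outdeg[src] += 1
        indeg[dst] += 1
    nodes = sorted(set(indeg) | set(outdeg))
    return {n: (indeg[n], outdeg[n]) for n in nodes}


def eigenvector_centrality(edges, iterations=200):
    """Power iteration: a node's score is the sum of the scores of the
    nodes that link to it, rescaled so that all scores sum to 1."""
    nodes = sorted({n for edge in edges for n in edge})
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for src, dst in edges:
            nxt[dst] += score[src]
        total = sum(nxt.values())
        score = {n: v / total for n, v in nxt.items()}
    return score


print(degree_counts(EDGES))
print(eigenvector_centrality(EDGES))
```

In this toy network `p1` and `p3` have the same in-degree as `p2`, but they score higher on eigenvector centrality because their single incoming like comes from the well-connected `hub`. This is the kind of difference between the rankings that the question asks about.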
  3. Perform a cluster analysis to identify some highly connected clusters (“Modularity” in Gephi). Can you interpret the clusters that are identified?

The clustered data shows 7 main clusters that are connected. We applied the degree range filter with the minimum set to 2 connections, and used the Force Atlas algorithm to display the four major clusters more clearly. We then used the “no overlap” algorithm and removed the edges to make the graph more readable. While analyzing the data we gained several insights, such as which cricket boards in the ICC are the most popular and active. In the graph, the most popular boards are closer to the center of the circle. In the complete graph, all clusters are connected to each other through the center. In the Gephi Modularity analysis we found 4 major clusters:

  • Cricket.com.au
  • Cricket South Africa
  • ICC – International Cricket Council
  • Blackcaps

Cricket.com.au is one of the most popular of these clusters, based on post activity, fan count and “talking about” count. The statistics show that it is very popular among fans in the cricket world. We found that the ICC and the local cricketing authorities take a strong interest in the game and that there is a lot of activity going on throughout the year. The other clusters also put a lot of effort into promoting the game through leagues, championships and first-class matches. All these clusters are performing well and, judging by their fan following across the world, they are growing stronger day by day.

  4. Find a good way to visualize the network by trying different Layout algorithms, different node sizes, colours etc. Take a screenshot and post the picture in your weblog. Explain the most important insights you gained in a few words.

Gephi is software for quickly and easily building network visualizations that turn data into insight. The Gephi site includes everything you need to build your own. To find a good way to visualize the network, we experimented with different layout algorithms, node sizes and colors to make our visualization more useful.

Layout algorithm: The layout algorithms are probably the most important tool for making the data readable. We tried several of them with various settings and kept coming back to them until they fit the needs of the current data set. We like to start with Force Atlas, which gives good results in many cases. Once there is not much movement going on anymore, we stop it.

Color nodes:

The nodes usually represent the main entities in the data. Coloring them helps us see how each cricket board connects to the center of the network.

  • The purple nodes seem to be about the International Cricket Council (ICC).
  • The green ones seem to be the South Africa cricket board.
  • The blue ones appear to be Cricket.com.au.

We get much more out of a visualization if we can interpret it without having to look everything up.

Set node size by in-degree:

We want to know which nodes are the most important in this specific data set. Therefore we go to Appearance -> Nodes -> Size -> Attribute -> In-Degree and set the size range from 1 to 100, after testing which sizes work best for this graph. After that we turn on labels and set their size to the node size.
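As a rough illustration of what this size setting does, the sketch below maps in-degree linearly onto the size range 1 to 100. It assumes a plain linear interpolation (Gephi also offers other interpolation curves) and uses only the four in-degree values listed earlier, so the smallest of those four stands in for the real minimum of the network. The function name `scale_sizes` is ours.

```python
def scale_sizes(values, min_size=1.0, max_size=100.0):
    """Map each value linearly onto the range [min_size, max_size]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All nodes have the same value: give them the same mid-range size.
        return {name: (min_size + max_size) / 2 for name in values}
    span = max_size - min_size
    return {
        name: min_size + (value - lo) / (hi - lo) * span
        for name, value in values.items()
    }


# In-degree of the four top pages from the list above (illustrative subset).
in_degree = {
    "cricket.com.au": 87,
    "ICC": 48,
    "Cricket South Africa": 39,
    "Australian Men's Cricket Team": 40,
}

for page, size in scale_sizes(in_degree).items():
    print(f"{page}: {size:.1f}")
```

With this mapping the page with the highest in-degree gets size 100 and the lowest gets size 1, which is why cricket.com.au dominates the picture.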

Links:

These are the connections between data entities. We use attributes like weight, color and style to communicate different kinds and strengths of connection.

  5. Find at least one practical question for your dataset that you can answer with the data you have obtained (e.g. which users are similar? who can I trust most? what should the system recommend in a particular situation? which are the most important nodes?)

Because this data was extracted starting from the BCB Facebook page, it identifies the nodes of this network that are rarely or very often connected in social media. During the analysis we found that the users who visit these Facebook pages all belong to the cricket world, and we configured node color and size by degree rank. The pages are similar to each other because all fans visit them to get information related to cricket. The fans we trust the most are those who visit these pages frequently and interact regularly. The graph above already shows which page is the strongest on social media on the basis of these regular fans. We can tell a lot just from this one image. When we turn on labels, we can see which page each circle represents. The color indicates the grouping, and the circle size shows the comparative strength of the page.

The further out the dots are, the less internally linked the pages are. From the number of nodes of each color we can guess which cricket boards have created the most content for their fans, and what has helped them attract fans from external links and resources. For example, we can see a lot of green dots around the Cricket South Africa cluster, which suggests that it is an important area of the sport and that a lot of content is being created around it for fans around the globe.

https://ifi7167socialcomputing.wordpress.com/data-analysis/

Assignment 5: Data Analysis Reading

What were the assumptions of the model and the data analysis the authors used?

From reading this paper, we know that the authors analyze the adoption and abandonment dynamics of OSNs by analogy with the dynamics that govern the spread of infectious disease. The decline of a social network is modeled by modifying the traditional SIR model to include infectious recovery dynamics (the irSIR model), which the authors argue is a better description of OSN dynamics, although its validity for modeling the abandonment of OSNs is doubtful. They validate the irSIR model on Google search query data for “MySpace”. In addition, the authors assume that web search traffic is a general metric of social network usage.
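To show what “infectious recovery” means in practice, here is a minimal Python sketch of the irSIR dynamics as we understand them from the paper: users join through contact with active users (S to I at rate beta·I·S/N) and leave through contact with users who have already left (I to R at rate nu·I·R/N). The simple Euler integration, the parameter values and the starting shares are our own assumptions for illustration, not the values the authors fitted.

```python
def simulate_irsir(beta, nu, s0, i0, r0, steps, dt=0.01):
    """Euler integration of the irSIR model.

    S: potential users, I: active users, R: users who have left.
    Returns the list of (S, I, R) states, one per time step.
    """
    s, i, r = s0, i0, r0
    n = s + i + r
    history = [(s, i, r)]
    for _ in range(steps):
        adoption = beta * i * s / n    # joining spreads via active users
        abandonment = nu * i * r / n   # leaving spreads via former users
        s -= dt * adoption
        i += dt * (adoption - abandonment)
        r += dt * abandonment
        history.append((s, i, r))
    return history


# Illustrative parameters only (population shares sum to 1).
history = simulate_irsir(beta=2.0, nu=1.0, s0=0.98, i0=0.01, r0=0.01, steps=5000)
peak_active = max(i for _, i, _ in history)
print(f"peak share of active users: {peak_active:.2f}")
print(f"final share of active users: {history[-1][1]:.4f}")
```

Unlike in the plain SIR model, recovery here needs a small starting group of former users (`r0` greater than zero), otherwise nobody ever leaves. This is one of the assumptions that can be questioned.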

  • Find three arguments that could call the authors’ conclusions into question.
  1. The use of social networks is not really comparable to a disease, even though both spread rapidly. OSNs provide benefits, whereas a disease is something people try to recover from.
  2. User base and usage are not necessarily reflected in Google Trends data. Nowadays people use Facebook on their smart devices (tablet, phone) without going through Google.
  3. The research predicted that Facebook would lose 80% of its users between 2015 and 2017. That period has passed and the prediction turned out to be false: Facebook’s audience even increased during that time.
  • Check the statistics of the keyword “facebook” in Google Trends. Compare it with the prediction in the paper, as well as with search data for Twitter, Youtube and Whatsapp. Comment on the results!

Google Trends shows that Facebook and YouTube are both on a declining trend compared with previous years, while Twitter and WhatsApp are approximately level.

Figure: Comparison of the four OSNs (Facebook, Twitter, YouTube and WhatsApp) from 2004 to the present.

https://ifi7167socialcomputing.wordpress.com/data-analysis/

Assignment 4: Echo chambers, filter bubbles, opinion mining and opinion manipulation

Blogs Are Echo Chambers: Blogs Are Echo Chambers
Eric Gilbert, Tony Bergstrom and Karrie Karahalios

1. How were agreement and disagreement measured in this paper?

This paper presents an empirical analysis of blog comments from 33 of the world’s top blogs. Agreement clearly outnumbers disagreement when a commenter takes a position on a blogger’s post. To test for echo chambers, the authors classified the comments of the top 33 blogs as indexed by Technorati. Comments were classified into three categories: agree, disagree and neither. Machine learning and NLP techniques were used to extract useful information from the text and decide between agreement and disagreement. To do this, they wrote custom text analysis code (in Java and Perl) and used the Weka toolkit.

2. From your personal experience, give some examples of online communities, where there is very little disagreement between opinions (or disagreement is not tolerated).

Nowadays, Facebook groups, Facebook fan pages, WhatsApp groups, imo groups, Twitter and LinkedIn are strong online communities where people share their thoughts.

In my personal experience, I have seen movies, songs and dramas being promoted in Facebook groups. People are constantly exposed to different ideas, and there is no way you can agree with every person who crosses your path. In some cases you may disagree with their opinion, and you can make your point respectfully and civilly. Even so, if your opinion differs from that of the group for a while (whether it is a Facebook group, Facebook fan page, WhatsApp group, imo group, Twitter or LinkedIn), you may be shunned or removed from the group. Disagreement is not always unpleasant; in some cases it can even be educational, but in these communities it is not tolerated.

3. What features of the Facebook wall (or Facebook in general) promote the development of echo chambers?

Facebook has features such as life events, likes and dislikes, hashtags, memories, People You May Know, notes and birthdays. It continuously reminds you through notifications of things you may have completely forgotten. By keeping users engaged in this way, these features promote the development of echo chambers.

Public Discourse in the Age of Personalization: Psychological Explanations and Political Implications of Search Engine Bias and the Filter Bubble
Audrey B. Carson

1. Please propose an alternative design for Google search that addresses problems posed in this paper. By design I mean here just a short description of user interactions, information filtering and presentation of results.

When we search for something on Google, it suggests many kinds of results at the same time, such as images, articles, videos and web pages, and it takes effort to find the right thing. My suggestion is to offer two different tabs at the same time: one with specific results and another with a mixture of all result types.

2. Considering that the same problems apply to Facebook, is it possible to apply your design to Facebook too?

Currently, Facebook looks messy compared with the past. We cannot even find our friends’ posts, such as statuses and pictures, if we did not check in time. In my opinion, every Facebook account should have two walls: one for friends’ posts such as statuses, pictures and notes, and another for news, commercial ads, fan pages and group posts.

 

https://ifi7167socialcomputing.wordpress.com/platforms-and-paradigms/echo-chambers-filter-bubbles-opinion-mining-and-opinion-manipulation/

Assignment 3: Lifestyle, Mobility and Location

 

The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City

1. What patterns are used by the authors to define a livehood cluster?

Livehoods help us reconceptualize the elements of a city based on the social media its residents generate. The authors show that the Livehood mappings not only reflect known divisions but also reveal subtle changes in local social patterns and the effects they have on the character of the city. Livehood clusters are defined from check-in data gathered from Foursquare, a location-based online social network founded in 2009 that lets users share their location with their friends by “checking in” to the places they visit. The authors developed an interview protocol that explored the similarities and differences between the clusters and the official municipal neighborhood boundaries. They focused on three dispersion patterns that describe the intersection between Livehoods and municipal borders:

(1) split – when a municipal neighborhood contains more than one Livehood,

(2) spilled – when a Livehood cluster spills over the boundaries of a municipal border, and

(3) corresponding – when the Livehood cluster and the municipal borders coincide.

They use these patterns to identify points of interest to explore in the interviews, assuming that different patterns reflect different local dynamics.

2. What could be some of the benefits, for different stakeholders, of using social media data to understand the structure of a city?

The social platforms mentioned above are each known as an LBSN (Location Based Social Network). Each provides a means for users to include location information in their posts and conversations. People who use these apps are continually broadcasting information about where they are and what they are doing. This gives city planners data on where people are gathering and what resources they are using at any given time. Based on the aggregated and analyzed check-ins, researchers could categorize land used for business, daytime leisure, nightlife and residential activities.

This information can be used to draw intelligent conclusions on what services may be needed in given areas within a city at a given time. For example, a series of blocks with a large number of geolocated check-ins on Foursquare and Urbanspoon in the evenings would likely be highlighted as a high nightlife use area.

Urban planners can then use this information to improve existing services or plan for new services in the area. This can include the placement and planning of police patrol routes, police substations, fire and ambulance services, and public transportation.

The data collected from LBSN networks can be used in combination with other data as well. Publicly available data from police and emergency service records, and even traffic data can help planners understand if an area is struggling with crime, crowding, traffic flow issues or natural disasters which may affect supply chain management or peaceful urban living. Data such as complaints collected from residents about usage issues is also important.

Tweets from Justin Bieber’s Heart: The Dynamics of the Location Field in User Profiles

  1. How did the authors identify a user’s location indirectly?

The authors found that 34% of users did not provide real location data, frequently incorporating fake locations or sarcastic comments that can fool traditional geographic information tools.

User’s location:

  • The authors considered users’ implicit location sharing behavior: observing only a user’s tweets, they used simple machine learning techniques to guess the user’s location.
  • The method was designed to analyze the geographic information entered by users as well as the scale of any real geographic information.
  • To classify user locations, the authors developed a Multinomial Naïve Bayes (MNB) model and used it to analyze the text content of users’ tweets.
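The idea behind such a classifier can be sketched in a few lines of Python. This is a toy Multinomial Naïve Bayes with add-one smoothing, written by us for illustration: the example tweets, the labels and the function names are invented, and the feature selection the authors used to pick location-revealing words is not reproduced here.

```python
import math
from collections import Counter, defaultdict

# Invented toy training data: (tweet text, home location).
SAMPLES = [
    ("howdy yall rodeo", "Texas"),
    ("rodeo bbq yall", "Texas"),
    ("subway bagel yankees", "New York"),
    ("bagel broadway subway", "New York"),
]


def train_mnb(samples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = text.split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_mnb(model, text):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # class prior
        for token in text.split():
            if token in vocab:  # words never seen in training are skipped
                smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
                score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_mnb(SAMPLES)
print(predict_mnb(model, "yall want bbq"))
print(predict_mnb(model, "late subway after yankees game"))
```

Even without any explicit location field, a few regional words are enough for the classifier to make a guess, which is exactly what raises the privacy question discussed next.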
  2. What can be some of the ethical and privacy implications of discovering user’s location indirectly in a social network?

Many conflicting issues contribute to the complexity of good ethical practice in social network research. However, I believe that this is public data: people know this when they sign up and give permission to access their location. It is therefore ethically acceptable as long as the purpose is not to exploit someone’s privacy but to help the user in some way.

The Shortest Path to Happiness: Recommending Beautiful, Quiet, and Happy Routes in the City

  1. Describe a similar context where big data from social networks could be analyzed and used to facilitate or improve the livelihood of a place.

Big data from social networks can be integrated and analyzed to give people a deeper understanding of the status of city operations and to help them make more informed decisions. For example, medical records are categorized by hospital or region, while medical images are categorized in terms of individual medical devices and hospitals. Meanwhile, health data can be categorized in terms of individuals, hospitalized patients, communities, or health and anti-epidemic authorities.

In addition, big data from social networks must be processed with new data processing technology that can describe the real-time status of various city elements, including buildings, streets, pipelines, environments, enterprises, finance, products, markets, logistics, medicine, culture, education, traffic, public order and population.

https://ifi7167socialcomputing.wordpress.com/platforms-and-paradigms/lifestyle-mobility-location/

Assignment 2: Case Analysis Project

A Study of LinkedIn for professional networking and career advancement

What Is Professional Networking?

“Networking is a deliberate activity to build, reinforce and maintain relationships of trust with other people to further your goals. Professional networking is simply networking focused on professional goals.” – Andrew Hennigan, Networking speaker, trainer, coach. Author of “Pay forward Networking”.
The results of professional networking can include:

  1. Job offers
  2. Additional sales
  3. Access to talent for recruitment
  4. Offers to work as a speaker, trainer, etc., either for a fee or for free as further promotion
  5. Insights into how things “really work” in other organizations or business sectors
  6. Increased social status and access to social events inside and outside the corporate sphere.

A very important part of professional networking is that it is about building relationships and trust, not just exchanging business cards. For networking to succeed, you need to build a real relationship.

Some reasons to become a member of a professional networking site:

  1. To access information and tacit knowledge
  2. To be perceived as the “top-of-mind” expert in a field of activity
  3. To establish new professional relationships and strengthen existing ones
  4. To increase the trust of others in you and your trust in others
  5. To feel good. Let’s be honest, this is also an important reason
  6. To make sure other people know who you are, what you do, how you do it and what you want to achieve by doing it
  7. Self-promotion, establishing oneself as an authority
  8. Networking is about real relationships and true trust, something that is very difficult to establish only online

LinkedIn: A Social-PROFESSIONAL Networking Site

LinkedIn is one of the most trusted online sources for job opportunities; its natural communities are companies, professional societies and industry associations.

In other words, LinkedIn is a social network for professionals. Whether you are a marketing manager in a large company, an entrepreneur who runs a small local store or even a first-year university student who wishes to explore future career options, LinkedIn is for anyone who is interested in taking their professional life seriously by looking for new opportunities to grow their career and connect with other professionals.

You can think of LinkedIn as the high-tech equivalent of going to a traditional networking event where you meet other professionals in person, talk a little about what you do and exchange business cards. On LinkedIn, you add “connections” much as you would make a friend request on Facebook, converse via private message (or through the contact information available), and present all your professional experience and achievements in a well-organized profile to show to other users.

Signing Up to Join LinkedIn:

LinkedIn is the world’s largest professional network with hundreds of millions of members, and it is growing rapidly. Its mission is to connect the world’s professionals to make them more productive and successful.

To join LinkedIn and create your profile:

  1. Navigate to the LinkedIn sign up page.
  2. Type your first and last name, email address, and a password you’ll use.
  3. Click Join now.
  4. Complete any additional steps as prompted.

LinkedIn’s Features:

Here are some of the basic features offered by this business network and how they are designed to be used by professionals.

Home: Once you’ve logged in to LinkedIn, the home page is your news feed, which shows recent posts from your connections and from the company pages you follow.

Profile: Your profile shows your name, your photo, your position, your occupation and more at the top. Below that, you can customize various sections such as a brief summary, work experience, education and other sections, much as you would create a traditional resume or CV.

My network: Here you will find a list of all the professionals you are currently connected to on LinkedIn. If you hover over this option in the top menu, you will also see a number of other options that allow you to add contacts, find people you might know and find alumni.

Jobs: All types of job listings are posted on LinkedIn every day by employers, and LinkedIn recommends specific jobs based on your current information, including your location and optional job preferences that you can fill in to get more personalized listings.

Interests: In addition to your connections with professionals, you can also follow certain interests on LinkedIn. These include business pages, groups based on location or interest, LinkedIn’s SlideShare platform for publishing presentations, and LinkedIn’s Lynda platform for educational purposes.

Search bar: LinkedIn has a powerful search function that lets you filter results based on different customizable fields. Click on “Advanced” next to the search bar to find professionals, companies, jobs and more.

Messages: When you want to start a conversation with another professional, you can do so by sending them a private message via LinkedIn. You can also add attachments, include photos and more.

Notifications: Like other social networks, LinkedIn has a notification feature that lets you know when you’ve been endorsed by someone, invited to join something or welcomed to check out a post that may interest you.

Pending invitations: When other professionals invite you to connect with them on LinkedIn, you’ll receive an invitation that you need to approve.

These are the main features you will notice when you first access LinkedIn, but you can delve into some of the more specialized details and options by exploring the platform yourself. In the end, you might be interested in using LinkedIn’s business services, which allow users to post jobs, leverage talent solutions, advertise on the platform and expand their sales strategy to include social selling on LinkedIn.


Why LinkedIn:
LinkedIn provides exactly what its name implies: a way to be connected to the most influential people, groups and areas of study. As the largest and most influential professional network in the world, LinkedIn allows you to reach out to business people and connect with all kinds of professionals around the world. The network allows you to establish thought leadership, build vital relationships, generate valuable leads, acquire information, conduct market research, maintain and improve your online reputation and create online communities.

LinkedIn Vs. Monster

LinkedIn and Monster are both popular with business professionals. While the two share many common features, such as allowing users to post resumes and search for jobs, there are key differences. LinkedIn primarily acts as a social networking website for professionals. Monster is more of a job search engine.

  • LinkedIn: Importing Information from Your Resume to Build Your Profile
  • Monster: Create a Resume on Monster
  • LinkedIn: Advanced Job Search
  • Monster: Monster’s Career Advice Forums
  • LinkedIn: Subscription Plans
  • Monster: Career Services

https://ifi7167socialcomputing.wordpress.com/2016/04/14/assignment-2-pick-a-case-analysis-project/

Assignment 1: Social Networks, Crowdsourcing & Resource Sharing

Tie strength in question answer on social network sites

1. How do the authors define strong and weak ties, and how did they measure the strength of a tie in Social Network Sites?

Strong and weak ties are both relevant and important in interactions on social networks. They perform different functions in relationships but can extend the network well beyond its normal range. Using and maintaining weak social ties can bring far-reaching benefits outside of normal relationships.

Measuring tie strength:
  • Days since last communication
  • Days since first communication
  • Words exchanged
  • Mean tie strength of mutual friends
  • Positive emotion words
  • Intimacy words

2. What could be the reasons that more helpful answers come from weak ties? What could be the reasons that more helpful answers come from strong ties?

Strong ties will probably go to great lengths and spend a lot of time to help you. But since you have only a few strong ties, they might not be able to provide helpful answers. In contrast, your acquaintances (weak ties) will probably spend only a few minutes, for example to forward your question to their own contacts.

3. As participants were especially looking for reliable information, how could reliability of the information source be measured in Social Networks?

There are some ways to measure the reliability of the information source in Social Networks:

Knowledge of the topic: how informative the answer is and how much it contributes to overall knowledge.

Tie strength and overall knowledge: how much the answer contributes to your overall knowledge.

Tie strength and value: how much participants trust specific answers is positively correlated with tie strength.

Tie strength and already known information: friends with wildly different tie strengths.

Games with a Purpose (GWAP)

  1. Which mechanisms do the games include that motivate people to “think alike”?

As the games are directly related to human psychological needs and behavioral patterns, they are becoming powerful tools for achieving goals. Online games are such a method for encouraging people to participate in the process. Such games constitute a general mechanism for using brain power to solve open problems.

“Games with a purpose” have a vast range of applications in areas as diverse as security, computer vision, Internet accessibility, adult content filtering, and Internet search. Two such games under development at Carnegie Mellon University, the ESP Game, and Peekaboom demonstrate how humans, as they play, can solve problems that computers can’t yet solve.

The mechanisms in these games that motivate people to “think alike” combine the following two principles:

Define difficult but achievable tasks: The game offers many achievable short-term goals to maintain commitment.

Define clear objectives and game rules: The game provides clear objectives and well-defined rules of play to ensure that players feel empowered to achieve goals.

While the concept of Games with a Purpose (GWAP) is simple, a game consists of goals, actions, tokens, feedback, a rule system, challenge and the user’s skill, each with the following meaning. Goals express a certain game state the player wants to achieve. Actions determine what a player can perform to approach his goals. Tokens describe the entities a player can act upon; their configuration represents the game state. Rules refer to the algorithms determining the effects of the player’s actions on the game state. Feedback stands for information by which the game informs the player of its current state in response to his actions. Challenge refers to the central skill that has to be mastered.

2. Explore the site http://ajapaik.ee/ and note some game mechanics that make this GWAP work

According to the website, Ajapaik.ee lets you explore the history of your neighborhood, rephotograph the views for future generations and see how places change over time.
Recognize the old photos and place them on the map where you think they have been taken. How many photos can you locate? The closer you get to the correct vantage point, the more points you earn. Get extra points for reshooting the view. This is a basic puzzle game.

https://ifi7167socialcomputing.wordpress.com/2018/09/07/assignment-1-topics-on-social-computing-1/
